
Data AI Architect
About the role
Key Responsibilities:
Data Architecture for AI
- Architect AI data foundations, including ingestion, transformation, enrichment, and serving layers
- Design data architectures supporting RAG, embeddings, feature stores, and training data pipelines
- Define standards for data quality, lineage, versioning, and governance for AI workloads
- Ensure data platforms support scalable, high-performance, low-latency AI use cases
Data Quality & Assurance
-
Architect data validation and testing frameworks for AI and analytics systems
-
Enable automated validation for data correctness, drift, bias, and completeness
-
Define test strategies for data migration, data transformation, and AI readiness
-
Collaborate with QE teams to embed data assurance into pipelines and platforms
-
Platform & Integration
-
Integrate data platforms with AI services and analytics tools
-
Define secure access patterns for data used in training, inference, and evaluation
-
Enable observability for data pipelines and AI data consumption
-
Guide teams on best practices for AI enabled BI and data driven systems
-
Core Platforms, Frameworks & Tooling
-
LLM and foundation model platforms (e.g., AWS Bedrock, Azure OpenAI, Vertex AI)
-
Agentic AI and orchestration frameworks (Lang Chain, Lang Graph, CrewAI, Auto Gen, Google ADK or equivalent)
-
CI/CD and MLOps tooling for AI pipelines (GitHub Actions, Azure DevOps, Jenkins)
-
Data ingestion and processing platforms (Spark, Kafka, cloud native ETL/ELT frameworks)
-
Data quality and validation frameworks (Great Expectations, Amazon Deequ, custom reconciliation frameworks)
-
Feature stores and embedding pipelines (Feast, embedding generation pipelines, vector databases)
-
Data drift, bias, and consistency monitoring tools (Evidently, statistical data quality monitors)
-
Metadata, lineage, and governance platforms (Data Hub, Apache Atlas, cloud data catalogs)
-
AI enabled analytics and Generative BI platforms (Power BI with Copilot, semantic layers, NLQ enabled BI)
-
Cloud native data platforms and storage (object storage, distributed query engines, data lakehouses)
-
Client Orientation & Leadership
-
Partner with product and engineering teams to identify Data for AI opportunities and shape roadmaps
-
Support client workshops, RFPs, and solution presentations
-
Mentor engineers on AI/ML/Gen AI best practices and emerging technologies
-
Translate complex AI concepts into business-friendly narratives
-
Must Have Qualifications
-
13+ years of experience in software engineering with 3+ years in AI with strong architecture ownership
-
Strong expertise in data engineering, data quality, and data governance
-
Experience supporting AI use cases such as RAG, feature engineering, and model training
-
Proficiency with data platforms, cloud services, and distributed data systems
-
Solid understanding of QE practices related to data validation and testing
-
Good to Have Skills
-
Experience with Generative BI or AI assisted analytics
-
Knowledge of metadata management, lineage tools, and data observability
-
Exposure to AI ethics and bias in data sets
-
Cloud data certifications
Education: Bachelor of Engineering
Preferred skills:
- Technology->Machine Learning->Generative AI->Retrieval Augmented Generation (RAG)
- Technology->Data Engineering->Databricks
- Technology->Data Engineering->Palantir Foundry
- Technology->Data Management->Data Architecture->Data Architecture - Data Modeling
- Technology->Embedded Software->Matlab
- Technology->Agile Testing->Agile Testing - ALL->CD/CI
- Technology->Integration->Confluent Kafka
- Technology->Big Data - Data Processing->Spark
About Infosys
Headquarters: Bangalore