Honeywell

Sr Advanced AI Data Engineer

Role: Data Engineering
Level: Senior
Location: Monterrey, Mexico
Work: On-site
Type: Full-time
Posted: 1 week ago

About the role

As a Senior Advanced Data Engineer here at Honeywell, you will play a crucial role in designing, developing, and maintaining advanced data solutions that drive business insights and support decision-making processes. You will leverage your expertise in data engineering to build scalable data pipelines, optimize data storage, and ensure data quality and integrity.

Your ability to work with cross-functional teams and translate business requirements into technical solutions will be key to your success in this role.

In this role, you will impact the business by enabling data-driven decision-making, optimizing data processes, and improving overall data management. Your work will contribute to increased operational efficiency, cost savings, and enhanced customer satisfaction.

At Honeywell, our people leaders play a critical role in developing and supporting our employees to help them perform at their best and drive change across the company. You will help build a strong, diverse team by recruiting talent, identifying and developing successors, driving retention and engagement, and fostering an inclusive culture.

YOU MUST HAVE

  • Databricks: 4+ years of hands-on experience with PySpark, Delta Lake, Workflows, and Unity Catalog
  • Demonstrated expertise in data strategy, for example Medallion Architecture, domain data modeling, and functional data architecture
  • Data quality frameworks (e.g. rule-based validation, anomaly detection)
  • Data pipelines: incremental loading, CDC, CI/CD, observability
  • Advanced Python/PySpark and advanced SQL
  • Strongly preferred: DLT, UC, GCP, Azure, Kafka
  • Databricks Certified Professional certification is highly valued
  • 7+ years of overall data engineering experience
  • 4+ years of hands-on Azure Databricks experience in production environments
  • Proven experience building platforms, not just maintaining them: greenfield builds, migrations, framework development
  • Experience with financial, engineering, enterprise, or industrial-scale datasets preferred
  • Demonstrated ability to own technical decisions end-to-end: from architecture to production deployment


AI-Ready Data Platform

  • Design and implement end-to-end ingestion pipelines from heterogeneous sources (including Snowflake, SQL Server, Excel, REST APIs, and unstructured data) into Azure Databricks
  • Architect and enforce Medallion Architecture (Bronze → Silver → Gold), ensuring data arrives clean, validated, and fit for purpose at each layer
  • Build Delta Live Tables (DLT) pipelines with declarative data quality expectations, schema evolution, and automated lineage tracking
  • Implement incremental loading patterns using CDC (Change Data Capture), watermarking, and Delta Lake MERGE/UPSERT for efficient, scalable ingestion
  • Enable structured and unstructured data processing (documents, Excel files, JSON, Parquet), building the foundation for AI and ML consumption
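
For readers unfamiliar with the incremental loading pattern named above, the following is a minimal sketch of watermark-based loading with upsert semantics. It is an illustration only, written in plain Python rather than Delta Lake, and every name in it (`incremental_upsert`, `id`, `updated_at`) is a hypothetical placeholder, not part of any Honeywell or Databricks codebase.

```python
from datetime import datetime


def incremental_upsert(target, source_rows, watermark, key="id", ts="updated_at"):
    """Merge rows newer than `watermark` into `target` (a dict keyed by `key`).

    Returns the new watermark, i.e. the latest timestamp that was loaded.
    """
    new_watermark = watermark
    for row in source_rows:
        if row[ts] <= watermark:
            continue  # already loaded in a previous run, skip it
        target[row[key]] = row  # insert a new key or overwrite an existing one (upsert)
        if row[ts] > new_watermark:
            new_watermark = row[ts]
    return new_watermark


# Hypothetical sample data: one existing row, then a batch with an update,
# a new row, and a stale row that predates the watermark.
target = {1: {"id": 1, "value": "a", "updated_at": datetime(2024, 1, 1)}}
source = [
    {"id": 1, "value": "a2", "updated_at": datetime(2024, 1, 3)},
    {"id": 2, "value": "b", "updated_at": datetime(2024, 1, 2)},
    {"id": 3, "value": "old", "updated_at": datetime(2023, 12, 31)},
]
watermark = incremental_upsert(target, source, datetime(2024, 1, 1))
```

In a Delta Lake setting the dictionary assignment would typically be replaced by a MERGE statement keyed on the same column, with the watermark persisted between runs.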

Data Modeling & Semantic Layer

  • Design and implement the Engineering data model (dimensional models, fact/dimension tables, and domain-specific data marts) serving analytics, BI, ML, and AI use cases
  • Build a governed, reusable semantic layer on top of the Gold layer, enabling self-service analytics through Power BI and GCP-connected consumers
  • Ensure data models are documented, versioned, and aligned to business domains within the VECE COE

Orchestration and Data Ops

  • Build and manage Databricks Workflows with multi-task dependencies, SLA monitoring, retry logic, and alerting
  • Implement CI/CD pipelines for Databricks using Azure DevOps and GitHub Actions, including Python Wheel packaging for reusable utility libraries deployed across the platform
  • Apply software engineering best practices: version control, unit testing, modular code design, and automated deployment to Dev/QA/Prod environments
  • Optimize cost and performance: cluster right-sizing, DBU management, Delta table optimization (VACUUM, compaction), and cost monitoring across Azure Databricks and GCP

Data Governance & Quality

  • Implement and manage Unity Catalog for centralized data governance: three-level namespace (catalog → schema → table), fine-grained RBAC, data masking, and audit logging
  • Build data quality frameworks (rule-based validation, deduplication, reconciliation, and anomaly detection), ensuring data arrives fit for AI/ML consumption
  • Establish data lineage tracking across ingestion, transformation, and serving layers
  • Govern data delivery to GCP, ensuring secure, validated, schema-consistent outputs consumed by downstream data science and analytics teams
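
As a rough illustration of the rule-based validation and deduplication mentioned above, here is a minimal sketch in plain Python. The rule names, field names, and helper functions (`validate`, `deduplicate`) are assumptions made for this example and do not describe an existing framework.

```python
def validate(rows, rules):
    """Split rows into (valid, rejected); rejected entries list the failed rule names."""
    valid, rejected = [], []
    for row in rows:
        failed = [name for name, check in rules.items() if not check(row)]
        if failed:
            rejected.append({"row": row, "failed": failed})
        else:
            valid.append(row)
    return valid, rejected


def deduplicate(rows, key):
    """Keep the last occurrence of each key value."""
    latest = {}
    for row in rows:
        latest[row[key]] = row
    return list(latest.values())


# Hypothetical rules: each maps a rule name to a predicate over one row.
rules = {
    "id_present": lambda r: r.get("id") is not None,
    "amount_non_negative": lambda r: r.get("amount", 0) >= 0,
}
rows = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},
    {"id": 2, "amount": -3.0},
    {"id": 1, "amount": 12.0},
]
valid, rejected = validate(rows, rules)
clean = deduplicate(valid, "id")
```

Keeping rejected rows together with the names of the rules they failed, rather than dropping them silently, is one way to support the reconciliation step listed alongside validation.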

AI & Proactive Analytics Foundation

  • Design pipelines that are AI-ready from day one, supporting structured ML feature pipelines, embedding generation, and future Vector DB integrations
  • Build the data infrastructure that enables the shift from descriptive dashboards to proactive, predictive analytics
  • Collaborate with Data Scientists and Analytics Engineers to ensure the Gold layer supports model training, feature stores, and real-time inference pipelines

Required skills

Data engineering

Data pipelines

Data storage

Data quality

Analytics support

About Honeywell

Monterrey

Headquarters