HCL Technologies

Azure Senior Data Lead

Role: Data Engineering
Level: Senior
Location: Bangalore, India
Work: On-site
Type: Full-time
Posted: 2 days ago

About the role

Job Summary

  • Design and build optimization capabilities for Databricks - spanning Spark tuning, cluster
    right-sizing, job orchestration, DBU consumption, and Delta Lake storage.
  • Translate platform expertise into product features - detection rules, recommendation
    engines, and safe automated actions for production environments.
  • Build POCs to validate optimization ideas, demonstrate value, and support pre-sales
    engagements.
  • Partner cross-functionally with backend, AI/ML, and data engineering teams to ship features
    end-to-end.

Skill Requirements

  • Engineering experience; hands-on experience with Databricks in production.
  • Apache Spark internals: Catalyst optimizer, Tungsten engine, AQE, DAG scheduler, shuffle
    behavior, partitioning, broadcast/sort-merge joins, data skew handling, and Spark 4.0
    capabilities.
  • Databricks platform depth: Delta Lake (transaction log, OPTIMIZE, ZORDER, VACUUM, liquid
    clustering, schema evolution, time travel, CDC/MERGE), Lakeflow Declarative Pipelines, Unity
    Catalog (governance, lineage, fine-grained access), Photon engine, Databricks Workflows,
    Lakebase, and all cluster types (job, all-purpose, serverless SQL, serverless compute).
  • Databricks REST API & SDK - programmatic management of clusters, jobs, permissions, and
    workspace configuration.
  • Performance tuning: Spark UI interpretation, physical plans, shuffle/skew/spill diagnosis,
    join optimization, caching strategies, and Photon adoption decisions.
  • Cost optimization: DBU forecasting, cluster sizing, autoscaling policies, spot vs. on-demand
    trade-offs, instance pools, job-vs-all-purpose decisions, predictive optimization, serverless
    economics (Performance vs. Standard mode, serverless GPU, egress, DBU trade-offs).
  • Advanced Python and expert SQL; deep knowledge of PySpark and Spark SQL internals.
  • Cloud platforms (AWS/Azure/GCP) - IAM, networking, storage (S3/ADLS/GCS), and cloud-native
    services underpinning Databricks.
  • Experience with Docker, Kubernetes, Terraform, and modern CI/CD pipelines.
  • Strong fundamentals in data structures, algorithms, distributed systems, and large-scale
    system design

  • MLflow, Mosaic AI ecosystem (Agent Framework, Agent Bricks, AI Gateway, Vector Search),
    feature stores, Databricks SQL Warehouses, or Databricks Asset Bundles.
  • FinOps practices and cost-attribution models for data platforms.
  • Observability tools: Prometheus, Grafana, OpenTelemetry, Datadog.
  • Contributions to open-source Spark/Delta/Databricks projects.

Other Requirements

Databricks certifications a plus

BS/MS in Computer Science, Engineering, or related field

Required skills

Databricks

Apache Spark

Delta Lake

REST APIs

Performance tuning

About HCL Technologies

Headquarters: Bangalore