
Azure Senior Data Lead
About the role
Job Summary
Key Responsibilities:
-
Design and implement data pipelines using best practices and industry-leading tools like Databricks and Azure Data Factory (ADF)
-
Extract, transform, and load large datasets from various sources, ensuring data quality and integrity
-
Utilize Python and Spark to perform complex data manipulations and aggregations
-
Write optimized SQL queries to interact with relational databases and Cosmos DB
-
Integrate data pipelines with APIs and external systems using efficient methods
-
Monitor and maintain data pipelines, ensuring smooth operation and identifying potential issues
-
Collaborate with data scientists, analysts, and engineers to understand data needs and deliver valuable insights
Key Responsibilities
-
Lead the end-to-end technical implementation of data projects using Azure Data Factory, Azure Databricks, SQL, Oracle PL/SQL, and Python.
-
Design and develop efficient and reliable ETL processes for large datasets.
-
Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions.
-
Optimize data workflows, troubleshoot issues, and ensure data quality and integrity.
-
Implement best practices for data security, governance, and compliance.
-
Provide technical guidance, mentoring, and support to junior team members.
-
Stay up to date with the latest trends and technologies in data engineering and analytics.
Skill Requirements
-
3 to 7 years of IT experience with Azure, Python, SQL, Kafka, NoSQL, .NET (C#), Snowflake, and IICS
-
Bachelor’s degree in Computer Science, Information Technology, or a related STEM field, or equivalent experience.
-
Good understanding of massively parallel processing (MPP) systems and experience building data warehouses/data marts on Azure Synapse SQL pools (SQL DW)
-
Strong SQL skills and experience writing complex yet efficient stored procedures, functions, and views using T-SQL
-
Solid understanding of Spark architecture and experience with performance tuning of big data workloads in Spark
-
Experience building complex data transformations on both structured and semi-structured data (XML/JSON) using PySpark and SQL, and refactoring traditional ML models to run on the Spark framework
-
Familiarity with Cognitive Search/Elasticsearch and their use cases, and with building integrations to load data into search services
-
Familiarity with the Azure Databricks environment and deploying Spark code on Databricks clusters
-
Good understanding of NoSQL and its use cases, modelling NoSQL schemas and containers, and building integrations to read from and write to Cosmos DB
-
Good understanding of distributed systems and experience building real-time integrations with Kafka
-
Good understanding of the Azure cloud ecosystem; an Azure data certification (DP-200/201/203) will be an advantage
-
Proficient with Visual Studio 2019+, IntelliJ/Eclipse, and source control using Git
-
Good understanding of Agile, DevOps, and CI/CD automated deployment (e.g. Azure DevOps, Jenkins)
-
Good knowledge of microservices architecture; any experience building microservices with .NET Core Web API will be an advantage
-
Any experience building RESTful services using Python FastAPI will be an advantage
-
Experience with the Snowflake data platform, including data loading, transformation, and query optimization
-
Experience with Informatica Intelligent Cloud Services (IICS), including cloud data integration, ETL/ELT pipeline design, and connector configuration
Other Requirements
-
Relevant certifications in Azure Data Factory, Azure Databricks, SQL, Oracle, or Python would be a plus.
-
PySpark, Kibana, SQL, and NoSQL skills; .NET is also good to have.
Required skills
Azure
Data Engineering
Data Leadership
About HCL Technologies
Hyderabad
Headquarters