
Senior ML Ops Technical Specialist - DevOps, Python
About the role
Job Summary
This role is responsible for architecting and delivering robust machine learning operations (ML Ops) solutions, driving automation and scalability across ML pipelines and DevOps practices. The individual provides strategic technical leadership, ensures adoption of industry best practices, and advances the organization's capabilities in deploying, monitoring, and maintaining ML models in production environments.
Key Responsibilities
-
Architect and implement ML Ops solutions using Python, MLflow, Kubeflow Pipelines, and TFX to enable scalable and automated machine learning workflows.
-
Design and integrate CI/CD pipelines with Jenkins, GitLab CI/CD, CircleCI, and GitHub Actions for continuous deployment of ML models and data pipelines.
-
Develop infrastructure-as-code templates using Terraform and AWS CloudFormation to provision and manage cloud resources for ML workloads.
-
Establish monitoring and logging frameworks with Prometheus, Grafana, ELK Stack, and Fluentd to ensure model performance, reliability, and traceability in production.
-
Lead the adoption and optimization of DevOps practices for ML systems, leveraging Ansible, Bash, and PowerShell for automation and environment management.
-
Serve as a subject matter expert, mentoring team members on ML Ops tools, cloud technologies, and best practices, and conducting technical training sessions.
-
Collaborate within the team to analyze project requirements, recommend innovative ML Ops solutions, and ensure alignment with organizational and client objectives.
-
Contribute to competency development by creating technical whitepapers, analyzing market trends, and building reusable solution assets for ML Ops.
Skill Requirements
-
Expert proficiency in ML Ops, including designing and managing end-to-end ML pipelines.
-
Excellent knowledge of Python for automation, scripting, and ML model integration.
-
Expert-level experience with MLflow, Kubeflow Pipelines, TFX, and Metaflow for workflow orchestration.
-
Advanced proficiency in DevOps tools such as Jenkins, GitLab CI/CD, CircleCI, GitHub Actions, and Ansible for pipeline automation.
-
Excellent skills in infrastructure-as-code using Terraform and AWS CloudFormation.
-
Expert understanding of monitoring and logging tools including Prometheus, Grafana, ELK Stack, and Fluentd.
-
Excellent command of version control systems: Git, GitHub, GitLab, Bitbucket.
-
Advanced knowledge of cloud platforms (AWS, Azure, GCP) for ML deployment and scaling.
-
Excellent ability to mentor, train, and guide technical teams in ML Ops best practices.
Other Requirements
-
Optional but valuable:
-
AWS Certified Machine Learning - Specialty
-
Google Professional Machine Learning Engineer
-
HashiCorp Certified: Terraform Associate
Required skills
MLOps
Python
DevOps
CI/CD
About HCL Technologies
Headquarters: Noida