
Technical Lead
About the role
Job Summary
Monitoring
- Monitoring of AWS infrastructures (Azure is an advantage) using Datadog (or equivalent) and KPIs.
- Proven experience defining efficient alerts and synthetic tests, analyzing logs for error detection, detecting issues using Datadog, managing SLIs and SLOs, leveraging NOC activity, and defining operational flows.
Architecture Understanding
- Infrastructure: In-depth understanding of designing distributed systems in cloud-based environments and microservices, and familiarity with the 12-factor app methodology.
- Business Logic: Understanding of complex cloud product architectures, including event-driven architecture, with a focus on how data and messages flow between services.
Continuous Improvement & Documentation
- Develop and maintain technical documentation for processes, procedures, and systems.
- Implement preventive measures and follow RCA principles for improvement as part of post-incident reviews and incident management processes.
- Promote a “Fix it twice” mentality: resolve the immediate issue and ensure it does not happen again through improvement or automation.
- Facilitate the adoption of SRE best practices across engineering squads through mentorship and standardized tooling.
Infrastructure & Cloud
- Proven experience with AWS services such as API Gateway, Lambda, SQS, SNS, S3, RDS, Redis Cache, Kinesis, Global Accelerator, CloudFront, Route 53, IoT Core, and Kubernetes/EKS.
- Understanding of common cloud services in production environments and of Infrastructure as Code (IaC) using Terraform and Crossplane.
Automation and CI/CD
- Experience with Azure DevOps, GitHub Actions, Argo, Argo CD/GitOps, and artifact management using Artifactory.
- Ability to review pipelines and Helm charts (or equivalent) and to understand automation processes.
Security (Preferred)
- Understanding of how to implement and maintain a secure product from both cloud infrastructure and application development perspectives.
Personal Requirements
- Bachelor’s degree in computer science or equivalent proven experience.
- 3+ years of experience in a hands-on DevOps or SRE position.
- Strong communication skills to align, document, and share knowledge across teams. Team player.
- Ability to work under high load and lead sensitive situations and investigations, especially when customer-facing services are impacted.
- Strong motivation for continuous learning and adoption of new technologies, with excellent problem-solving skills and a proactive approach.
Key Responsibilities
- Provide technical guidance and solutions; define, advocate, and implement best practices and coding standards for the team.
- Develop and guide team members in enhancing their technical capabilities and increasing productivity.
- Ensure process compliance in the assigned module, and participate in technical discussions and reviews as a technical consultant for feasibility studies (technical alternatives, best packages, supporting architecture best practices, technical risks, breakdown into components, estimations).
- Prepare and submit status reports to minimize exposure and risks on the project or to close escalations.
Benefits and perks
- Learning Budget
Required skills
Monitoring
Datadog
SLOs
Terraform
Crossplane
AWS
Azure DevOps
CI/CD
About HCL Technologies
Hyderabad
Headquarters