HCL Technologies

Sr Administrator (Support & Operations)

Role: DevOps
Level: Senior
Location: Bengaluru, India
Work: On-site
Type: Full-time
Posted: 2 days ago

About the role

Job Summary

As a Platform Engineer, you are responsible for ensuring the stability, performance, and automation of the cloud platform’s core services, including API automation layers, observability components, CI/CD workflows, IaC toolchains, and QA/Documentation systems.

Key Responsibilities

1. Platform Operations & Reliability (Run Engineering)

  • Operate and maintain key platform services such as the Terraform Registry, tracing infrastructure, SGCP Quality & Observability resources, and documentation and chat support systems.
  • Ensure availability, performance, resilience, and secure lifecycle management for all production components.
  • Perform patching, upgrades, and vulnerability remediation, aiming for minimal human intervention on production systems.
  • Lead incident response, perform deep root cause analysis, and implement long-term corrective actions.
  • Reduce operational toil through automation, workflow industrialization, and proactive reliability engineering.

2. CI/CD & Delivery Platforming

  • Operate and evolve the cloud platform’s CI/CD pipelines and reusable workflows used by ~300 developers.
  • Manage the lifecycle of base Docker images: security hardening, automated build pipelines, versioning, and distribution.
  • Maintain and extend the platform’s IaC toolchain, including Terraform workflows, deployment pipelines, and registry management.
  • Continuously improve delivery performance, deployment reliability, and overall developer experience.
  • Contribute to the technical roadmap with an engineering-driven mindset.

3. Observability Engineering

  • Maintain and enhance the cloud platform’s observability stack across traces and dashboards.
  • Ensure full visibility into system behaviour, performance drifts, errors, and capacity indicators.
  • Build automation for alerting, anomaly detection, and platform health insights, improving signal quality and reducing noise.
  • Support SRE practices to strengthen platform reliability through data-driven insights.

4. User Support & Platform Adoption

  • Participate in system demos, validation sessions, and operational readiness reviews.
  • Act as a partner for SG Cloud engineering teams in troubleshooting and platform enablement.

Other Requirements

Key Skills & Competencies

Technical Skills

  • Strong experience with CI/CD tooling (GitHub Actions, GitLab CI, Jenkins)
  • Solid expertise in Infrastructure as Code: Terraform, Ansible preferred
  • Hands-on experience with platform automation, scripting/coding (Python), and workflow orchestration
  • Proficiency in containerized environments (Docker/Kubernetes, registries, build pipelines)
  • Understanding of monitoring and observability at scale (metrics, logs, traces)

Engineering Mindset

  • Reliability-first mindset with strong operational discipline
  • Ability to automate, industrialize, and eliminate manual processes
  • Strong troubleshooting capabilities across distributed systems
  • Clear communication and collaborative problem solving across global teams

Benefits and perks

Learning Budget

Required skills

Platform engineering

CI/CD

IaC

Observability

Terraform

Docker

Incident response

RCA

About HCL Technologies

Headquarters: Bengaluru