HCL Technologies

Senior Technical Lead

Role: Infrastructure
Level: Senior
Location: Gautam Buddha Nagar, India
Work: On-site
Type: Full-time
Posted: 2 days ago

About the role

Job Summary

Senior Cloud Platform Engineer

You will own the reliability, security, and scalability of our GCP-based AI platform infrastructure. Everything runs on Cloud Run, managed via Terraform, deployed through Cloud Build. You are responsible for zero-downtime deployments, cloud cost control, end-to-end observability, and ensuring that IAM, VPC, and data security posture meet enterprise standards. You are also the person the data and AI engineers call when their Terraform apply fails or their Cloud Run service won't start.

Key responsibilities
Own and evolve the Terraform IaC codebase — write and maintain reusable modules for Cloud Run services, AlloyDB clusters, Spanner instances, BigQuery datasets, Memorystore Redis, Vertex AI endpoints, Artifact Registry, and VPC networking
Manage Cloud Build CI/CD pipelines across all services — branching strategy (GitOps), build triggers, test gate enforcement, multi-environment promotion (dev → staging → prod), and automated rollback on failed health checks
Design and maintain GCP security posture — IAM least-privilege service accounts, Identity-Aware Proxy (IAP) for all internal services, VPC Service Controls, Private Service Connect for AlloyDB and Redis, and Secret Manager integration
Build and maintain the full observability stack — Cloud Monitoring dashboards, OpenTelemetry collector configuration, structured JSON logging standards, distributed tracing across FastAPI and LangGraph services, and PagerDuty or equivalent on-call alerting
Define and track SLOs for all platform services — API p50/p95/p99 latency, data pipeline freshness, AI pipeline throughput, Cloud Run error rate — and run monthly reliability reviews
Manage Docker image strategy — multi-stage build patterns to minimise image size, distroless base images, Artifact Registry lifecycle policies, and automated vulnerability scanning with Container Analysis
Implement FinOps practices — BigQuery slot monitoring and reservation management, Cloud Run CPU/memory right-sizing, committed use discount planning, and per-team cost allocation using labels
Conduct quarterly infrastructure security reviews and respond to GCP Security Command Center findings

Must-have skills

Terraform — write modules from scratch, not just modify existing ones; HCL fluency, remote state backends (GCS), workspace management, and Terraform Cloud or Atlantis for GitOps CI/CD integration
GCP — 3+ years hands-on production experience: Cloud Run, BigQuery, Cloud Build, IAM, VPC networking, Cloud Monitoring, Secret Manager, Artifact Registry; GCP Associate Cloud Engineer or Professional Cloud DevOps Engineer certification strongly preferred
Docker — multi-stage builds, layer caching optimisation, distroless base images, image security scanning, and Artifact Registry management
CI/CD — Cloud Build or GitHub Actions: pipeline design from scratch, artifact versioning, environment-specific config management, and deployment gating strategies
Linux / bash — comfortable debugging inside running containers, writing shell automation scripts, managing file permissions and system resources
GCP networking — VPC design, subnet allocation, firewall rules, Private Service Connect, Cloud NAT, and DNS configuration for private service endpoints

Other Requirements

Good to have
OpenTelemetry — collector configuration, exporter setup (Cloud Trace, Prometheus), and custom instrumentation for Python FastAPI services
Kubernetes / GKE — even if the current stack is Cloud Run, GKE knowledge is valuable for future scale requirements
Python scripting for infrastructure automation — Cloud Functions, custom Cloud Build steps, GCP Admin SDK scripts
Cloud cost management tooling — Looker Studio billing dashboards, Budget Alerts, committed use planning models, and BigQuery billing export analysis
Azure networking basics — enough to understand the cross-cloud connectivity between Azure Databricks and GCP services
GCP Security Engineer certification or equivalent security background

Required skills

GCP

Terraform

Cloud Run

Cloud Build

IAM

Monitoring

OpenTelemetry

SLOs

About HCL Technologies

Headquarters: Gautam Buddha Nagar