HCL Technologies

Domain Architect - IBM FlashSystem Storage, Cisco Switch

Role: Infrastructure

Level: Lead

Location: Bengaluru, India

Work: On-site

Type: Full-time

Posted: 2 days ago


About the role

Job Summary

Own the reliability, availability, and performance of production NAS and/or Object Storage services.
Apply SRE principles to storage platforms: define reliability goals, improve observability, and reduce manual operational work through automation.
Design and build automation and Infrastructure‑as‑Code to manage storage systems at scale.
Lead troubleshooting and resolution of complex storage incidents; participate in on‑call and incident response.
Perform capacity planning, forecasting, and demand modeling to support business growth.
Partner with engineering teams to support application onboarding, testing, and production readiness.
Contribute to global storage initiatives, including lab and infrastructure deployments.
Create and maintain runbooks, documentation, and operational best practices to improve team efficiency.

What We’re Looking For

8+ years of experience in SRE, infrastructure automation, or platform engineering, with strong storage exposure.
Hands‑on experience operating NAS and/or Object Storage platforms (clustered storage, Ceph) in production.
Strong proficiency with automation and IaC tools (e.g., Ansible, Terraform, Puppet, SaltStack).
Experience running highly available, scalable systems in 24×7 environments.
Familiarity with containers and orchestration (Docker, Kubernetes).
Experience with CI/CD pipelines, monitoring, logging, and version control systems (Git, Perforce).
Strong incident management, troubleshooting, and communication skills.
Bachelor’s degree in Computer Science, Engineering, or a related field.

Nice to Have

Experience with large‑scale distributed systems.
Strong understanding of SRE concepts such as SLIs, SLOs, error budgets, observability, and logging.
Ability to debug and optimize infrastructure and automate repetitive workflows.
Proven ability to work independently and deliver results as a contractor in a global team environment.

Why This Role

Apply SRE practices to storage systems at scale.
Work on mission‑critical infrastructure.
High‑impact, ownership‑driven role.
Opportunity to influence reliability and operational maturity across teams.

Job Description: CI/CD pipeline

Key Responsibilities

We are looking for a Systems Storage Site Reliability Engineer (SRE) to support and scale our global storage platforms. This is a contractor position focused on applying SRE principles to storage systems: improving reliability, reducing operational toil, and enabling sustainable growth through automation and observability. You will work at the intersection of storage engineering and reliability engineering, partnering closely with infrastructure and application teams to operate production systems at scale.

Automation tools, IaC tools

Skill Requirements

Storage, NetApp, backup, SRE, platform engineering, object storage, Ceph/cluster, automation tools

Other Requirements

Containers and orchestration

Required skills

SRE

Storage platforms

NAS

Object storage

Ansible

Terraform

Kubernetes

Incident management

About HCL Technologies

HCL Technologies

Headquarters: Bengaluru