Apple

ML Infrastructure Service Reliability Engineer - Apple Services Engineering

Role: Machine Learning
Level: Senior
Location: Bengaluru, India
Work: On-site
Type: Full-time
Posted: 2 weeks ago

About the role

At Apple, we don’t just build products — we create transformative experiences
that have reshaped entire industries. Our innovation is driven by the diversity of
our people and their ideas, inspiring everything we do. Imagine the impact you
could make. Join Apple and help us leave the world better than we found it.

The ML Infrastructure team is responsible for managing Apple’s largest ML
compute platform and a multi-cloud storage abstraction and caching platform, which
together support critical machine learning training workloads that power user-facing
features across the Apple ecosystem. Operating across both first-party and
third-party cloud environments brings complex and unique challenges.

As a Site Reliability Engineer (SRE) on the ML Infrastructure team, you’ll be
expected to address these challenges through a strong foundation in cloud
object storage, data analysis, automation, collaboration, and advanced
expertise in Kubernetes. Our team oversees the full infrastructure stack — from
low-level nodes to the complete network architecture — ensuring our platform
remains highly available, resilient, and efficient at scale.

Required skills

Site Reliability Engineering

Machine Learning Infrastructure

Cloud Computing

Automation

Distributed Systems
