Jobs
Benefits & Perks
•Healthcare
•401(k)
•Equity
•Paid Time Off
•Parental Leave
•Commuter Benefits
Required Skills
Kubernetes
Docker
Python
Go
DevOps
MLOps
Cloud Architecture
Distributed Systems
Infrastructure as Code
Sr. Director - Backend Engineering
Key Skills and Role Responsibilities:
This role is for a strategic and technical leader to define, build, and operate the infrastructure orchestration systems that power our organization's cutting-edge Artificial Intelligence (AI) initiatives. The Senior Director will lead a team responsible for ensuring a robust, scalable, cost-efficient, and high-performance platform for all stages of the AI lifecycle, from experimentation and training to deployment and inference.
Strategy and Leadership
Define and execute the long-term vision and roadmap for the company’s AI infrastructure Network Services, aligning it with overall business and AI Services goals.
Lead, mentor, and grow a high-performing engineering and operations team focused on AI infrastructure and platform engineering.
Manage budget and resource allocation for AI infrastructure Network Services deliverables.
Act as a key liaison between AI infrastructure and other services owners and consumers, core engineering, Cloud infrastructure, and executive leadership.
AI Infra Development and Operations
Oversee the design, implementation, and maintenance of the core network orchestration platforms for large-scale AI model training (e.g., distributed training, hyperparameter tuning) and deployment (e.g., containerization, serverless functions, edge deployment).
Ensure reliability, security, and compliance of the AI infrastructure, meeting strict standards for data governance and model integrity.
Establish Service Level Objectives (SLOs) and Key Performance Indicators (KPIs) for the AI platform services and lead efforts for continuous optimization and performance tuning.
Technology and Architecture
Evaluate, select, and integrate the core technologies required for the AI stack (e.g., cloud overlay/underlay networking, InfiniBand, load balancers, DNS, core networking, Kubernetes, Ray, GPU/accelerator management, distributed file systems).
Champion infrastructure-as-code (IaC) principles to manage and provision AI resources consistently and at scale.
Qualifications
Required
Education: Bachelor's or Master’s degree in Computer Science, Engineering, or a related technical field.
Experience:
15+ years of progressive experience in software engineering, infrastructure, or platform operations.
5+ years of experience leading and managing technical teams, ideally at the Director or Sr. Director level or in an equivalent capacity.
Deep, hands-on experience designing and operating large-scale distributed systems and cloud-native network architectures.
Proven experience specifically with AI infrastructure orchestration (e.g., using Kubernetes) and managing accelerated compute resources (GPUs, TPUs, etc.).
15+ years of experience in cloud backend engineering, cloud design, deployment, and DevOps.
15+ years of experience leading system design and architecture on private clouds and AWS, Azure, and/or GCP.
10+ years of demonstrable experience building and operating infrastructure as code and infrastructure automation, with comfort across various flavors of Linux.
15+ years of experience in building high-performance, highly available, and scalable distributed systems in the cloud.
15+ years of experience in building and managing high-performance, highly available, and scalable Hybrid Cloud environments.
Excellent cross-group collaboration skills and outstanding verbal and written communication skills.
Skills:
Expert-level knowledge of containerization and orchestration (Docker, Kubernetes).
Software-defined cloud networking.
Strong background in DevOps and MLOps principles and tooling.
Proficiency in at least one modern programming language (e.g., Python, Go).
Exceptional strategic planning, organizational, and written/verbal communication skills.
Preferred
Prior experience managing infrastructure for training and inference of large language models (LLMs) or foundation models.
Experience in a regulated industry with strict compliance requirements.
Experience building and operating an AI private cloud.
Success Metrics
A successful Senior Director, AI Infrastructure Orchestration, will be measured by:
The time-to-market for AI infrastructure build, scale, and operation.
The resource utilization rate and cost efficiency of the AI compute infrastructure.
The reliability and uptime of the core AI platform services.
The talent retention and development within the AI Infrastructure team.
About Coupang
Reviews
2.5 (9 reviews)
Work Life Balance: 2.1
Compensation: 2.8
Culture: 2.4
Career: 3.2
Management: 1.9
25% Recommend to a Friend
Pros
Growth opportunities and career advancement
Good compensation and timely pay
Remote work flexibility
Cons
Poor work-life balance and long hours
Toxic management and controlling behavior
Poor working conditions and workplace safety
Interview Experience
2 interviews
Difficulty: 4.0 / 5
Offer Rate: 50%
Experience: Positive 0%, Neutral 50%, Negative 50%
Interview Process
1. Application Review
2. Recruiter Screen
3. Behavioral/Culture Fit Interview
4. Coding Interview
5. Technical Interview
6. Final Round
Common Questions
Coding/Algorithm
System Design
Behavioral/STAR
Technical Knowledge
Culture Fit