Sr. Staff Site Reliability Engineer

Coupang

Seattle, USA · On-site · Full-time · 1w ago

Compensation

$176,000 - $221,000

Benefits & Perks

Healthcare

Dental

Vision

Life Insurance

401(k)

Flexible Spending Accounts

Health Savings Account

Disability Insurance

Employee Assistance Program

Paid Time Off

Parental Leave

Commuter Benefits

Equity

Required Skills

Distributed Systems

System Design

Incident Management

Monitoring

Observability

Job Overview:

Site Reliability Engineer (SRE) at Coupang is a mission-critical role that combines software and systems engineering to build, run, and scale our complex, large-scale e-commerce systems. As part of the Site Reliability Engineering team, you will be responsible for ensuring that all our customer-facing services are healthy, monitored, automated, and designed to scale. As an SRE organization, we take pride in treating operations as an engineering problem, with an automation-first approach. You will use your background to build best-in-class infrastructure automation for areas such as observability, incident management, disaster recovery, load testing, capacity engineering, and many more. In this role you will work very closely with our product development teams, from the early stages of design all the way to helping resolve production incidents, maintaining the SLI/SLA bar for production services, and influencing those teams with SRE principles and best practices. If you take pride in complete ownership, have a passion for solving complex technical challenges in large-scale distributed systems, and have the demeanor to work and communicate effectively across team boundaries, this is the role for you!

Key Responsibilities:

  • Serve as a primary point of responsibility for the platform reliability, health, and performance of all Coupang customer-facing services.

  • Gain deep knowledge of Coupang application workflows and dependencies.

  • Define and track key performance indicators (KPIs) and service-level objectives (SLOs) related to system availability, performance, and reliability.

  • Build a world-class incident management process and automation, including fast incident remediation, incident operational reviews, and retrospectives.

  • Develop and implement best practices for creating, scaling, and maintaining effective monitoring, alerting, and telemetry systems.

  • Build automation to execute regular disaster recovery testing, chaos testing, and load testing to stay ahead of the expected growth of Coupang services.

  • Work closely with product development teams to ensure the products are designed with scale and operability in mind.

  • Build the right guardrails and automation for deploying production changes while holding the reliability bar.

  • Participate in a 24x7 rotation for production issue escalations and function well in a fast-paced environment.

  • Communicate effectively with people at all levels of the organization.

Basic Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or a related technical field.

  • 8+ years of industry experience building and operating large-scale distributed systems.

Preferred Qualifications:

  • Prior experience working with AI/ML, large-scale web-based Java architectures, and JVM configuration.

  • Professional certifications in cloud platforms, monitoring tools, or related technologies.

  • Previous experience working on large-scale GPU/cloud infrastructure platforms.

  • SLO/SLA management and implementation experience.

  • Deep UNIX/Linux systems knowledge and administration background.

  • Demonstrated programming skills in one or more of: Python, Java, Golang, Ruby.

  • Strong problem-solving and analytical skills spanning systems, networking (TCP/IP), and code, with a focus on data-driven decision-making.

  • Experience with cloud-based GPU infrastructure, including AWS, Azure, or Google Cloud Platform.

  • Strong understanding of DevOps and SRE practices, including continuous integration, continuous delivery, and infrastructure as code (IaC).

  • Experience with containerization and orchestration technologies, such as Docker and Kubernetes.

  • Excellent communication and collaboration skills, with the ability to work with teams across distinct functions and technical domains.

  • Knowledge of the open telemetry observability ecosystem, including metrics, logging, and tracing, and of tools such as Prometheus, Grafana, Elastic Stack, Datadog, or New Relic.

Pay & Benefits

Our compensation reflects the cost of labor across several US geographic markets. At Coupang, your base pay is one part of your total compensation.

The base pay for this position ranges from $176,000/year in our lowest geographic market to $221,000/year in our highest geographic market. Pay is based on several factors, including market location, and may vary depending on job-related knowledge, skills, and experience.

General Description of All Benefits

  • Medical/Dental/Vision/Life, AD&D insurance

  • Flexible Spending Accounts (FSA) & Health Savings Account (HSA)

  • Long-term/Short-term Disability

  • Employee Assistance Program (EAP)

  • 401K Plan with Company Match

  • 18-21 days of Paid Time Off (PTO) a year, based on tenure

  • 12 Public Holidays

  • Paid Parental leave

  • Pre-tax commuter benefits

  • MTV - Free Electric Car Charging Station

General Description of Other Compensation

“Other Compensation” includes, but is not limited to, bonuses, equity, or other forms of compensation that would be offered to the hired applicant in addition to their established salary range or wage scale.

Coupang is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to actual or perceived race (including traits historically associated with race, including but not limited to hair texture and protective hair styles), color, religion, religious creed (including religious dress and grooming practices), sex or gender (including pregnancy, childbirth, breastfeeding, and medical conditions related to pregnancy, childbirth or breastfeeding), gender identity, gender expression, sexual orientation, ancestry, national origin (including language use restrictions), age (40 and over), physical or mental disability, medical condition, genetic information, HIV/AIDS or Hepatitis C status, family status (including but not limited to marital or domestic partnership status), military or veteran status, use of a trained dog guide or service animal, political activities or affiliations, citizenship, family and medical leave status, status as a victim of any violent crime, or any other characteristic or class protected by the laws or regulations in the locations where we operate. Coupang is also committed to providing a safe work environment for its employees and its consumers. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us at usrecruiting@coupang.com.

Requisition: R0065794

Equal Opportunities for All

Coupang is an equal opportunity employer. Our unprecedented success would not be possible without the valuable input of our globally diverse team.

About Coupang

Coupang

An e-commerce platform that offers a wide variety of products, including apparel, electronics, and home goods.

Employees: 10,001+

Headquarters: Seoul

Valuation: $109B

Reviews

Overall: 2.5 (9 reviews)

Work Life Balance: 2.1

Compensation: 2.8

Culture: 2.4

Career: 3.2

Management: 1.9

Recommend to a Friend: 25%

Pros

Growth opportunities and career advancement

Good compensation and timely pay

Remote work flexibility

Cons

Poor work-life balance and long hours

Toxic management and controlling behavior

Poor working conditions and workplace safety

Interview Experience

Interviews: 2

Difficulty: 4.0 / 5

Offer Rate: 50%

Experience: Positive 0% · Neutral 50% · Negative 50%

Interview Process

  1. Application Review

  2. Recruiter Screen

  3. Behavioral/Culture Fit Interview

  4. Coding Interview

  5. Technical Interview

  6. Final Round

Common Questions

Coding/Algorithm

System Design

Behavioral/STAR

Technical Knowledge

Culture Fit