Lead Infrastructure Engineer

JPMorgan Chase

Singapore, SG · On-site · Full-time · 1mo ago

Required skills

  • AWS
  • Kubernetes
  • Terraform
  • Linux

Assume a vital position as a key member of a high-performing team that delivers infrastructure and performance excellence. Your role will be instrumental in shaping the future at one of the world's largest and most influential companies.

As a Lead Infrastructure Engineer at JPMorgan Chase within the Infrastructure Platform, you apply deep knowledge of software, applications, and technical processes within the infrastructure engineering discipline. You continue to evolve your technical and cross-functional knowledge outside your aligned domain of expertise.

Job responsibilities

  • Lead production operations for critical services: act as incident commander for Priority 1/2 events, drive rapid restoration, clear communications, and post-incident reviews with owned, time-bound remediations.
  • Own stability and resiliency improvements: implement and standardize patterns (timeouts/retries, circuit breakers, bulkheads, back-pressure, graceful degradation) and run failover/chaos exercises to validate recovery.
  • Drive cross-platform architecture and modernization: partner with application, platform, and security teams to design and implement changes that reduce operational risk and improve reliability and performance.
  • Deliver hands-on design, development, and troubleshooting for complex infrastructure issues; create durable fixes and automation that prevent recurrence and reduce manual toil.
  • Manage workstreams end-to-end across one or more infrastructure domains (e.g., Kubernetes, Linux, networking, databases, cloud), ensuring clear scope, milestones, and measurable outcomes.
  • Apply strong systems thinking: assess upstream/downstream dependencies and data flows; identify technical implications and advise on mitigation, rollout sequencing, and safe change strategies.
  • Operate effectively in a 24/7 model: support on-call rotations, improve runbooks and diagnostics, and continually raise the bar on detection, alert quality, and response time.

Required qualifications, capabilities, and skills

  • Bachelor’s Degree in Computer Science, Cybersecurity, Data Science, or related disciplines.
  • 5+ years of relevant infrastructure engineering experience, with increasing scope/ownership.
  • Deep expertise in one or more core areas: compute and OS (Linux), networking, databases/storage, container orchestration, CI/CD and deployment practices, integration/automation, scaling, resiliency, and performance engineering.
  • Strong observability and monitoring proficiency, including metrics, logs, distributed tracing, alerting, and SLO/SLA design.
  • Demonstrated troubleshooting across heterogeneous platforms and services, with hands-on administration in Linux, middleware, and databases.
  • Practical experience operating modern infrastructure stacks: Linux, Kubernetes, AWS, Terraform; and observability tooling such as Splunk, Grafana, Datadog, AWS X-Ray.
  • Database exposure with one or more of: Cassandra, Oracle, CockroachDB; ability to assess performance, capacity, and resilience trade-offs.
  • Proficiency in scripting and software engineering for infrastructure (e.g., Bash, Python); ability to build automation, tooling, and integrations.
  • Deep knowledge of cloud infrastructure and services across public and private clouds, including migration patterns and hybrid connectivity.
  • Experience identifying and resolving production issues on public cloud platforms; ability to lead service improvement plans and problem management.
  • Proven experience with LLM orchestration frameworks or custom agent runtimes; strong API design, reliability engineering, and end-to-end observability (tracing/metrics/logging). Delivered at least one agentic system to production with quantified impact (e.g., automation rate, latency, cost).

Preferred qualifications, capabilities, and skills

  • Incident leadership: serves as incident commander for Sev1/Sev2 events, drives clear comms and rapid restoration, and ensures post-incident reviews with owned, time-bound remediations.
  • SRE practices at scale: defines/enforces SLOs/SLIs and error budgets; improves on-call quality with actionable runbooks, sustainable alerting, and clear escalation paths.
  • Observability and automation: advances metrics/logs/traces and synthetic probes; builds self-healing automation for diagnostics and common remediations.

About JPMorgan Chase

JPMorgan Chase & Co. is an American multinational banking institution headquartered in New York City and incorporated in Delaware. It is the largest bank in the United States, and the world's largest bank by market capitalization as of 2025.

Employees: 300,000+
Headquarters: New York City
Valuation: $500B

Reviews

3.8 · 10 reviews

Work-life balance: 3.5
Compensation: 4.0
Culture: 3.8
Career: 3.2
Management: 2.8

68% recommend to a friend

Pros

  • Good benefits and compensation
  • Supportive colleagues and environment
  • Flexible work arrangements

Cons

  • Long hours and heavy workload
  • Management issues and lack of direction
  • High stress and expectations

Salary Ranges

44 data points

Junior/L3 · Analytics Solutions Associate (1 report)

Total: $139,000 per year
Base: $107,000
Stock: -
Bonus: -

Interview experience

4 interviews

Difficulty: 3.0 / 5
Duration: 14-28 weeks
Offer rate: 50%
Experience: Positive 25%, Neutral 75%, Negative 0%

Interview process

1. Application Review
2. HR Screen
3. Hiring Manager Interview
4. In-person/Final Interview
5. Offer

Common questions

  • Behavioral/STAR
  • Past Experience
  • Culture Fit
  • Financial Knowledge
  • Case Study