Site Reliability Engineer

JFrog

Bangalore · On-site · Full-time · 1w ago

Required Skills

SRE

DevOps

Kubernetes

Docker

Python

Go

Incident Response

Cloud Platforms

Fast-Frogward Your Career to Years From Now

JFrog is the only end-to-end software supply chain platform that provides complete visibility, security, and control for automating the delivery of trusted releases from code to production. Our platform enables organizations to manage, secure, and automate their software delivery process, fueling innovation without worry. We empower companies to build and release software faster and more securely than ever before.

With over 7,500 customers worldwide, including many Fortune 100 companies, JFrog is at the forefront of global innovation. Join us in shaping the future of software delivery and contributing to solutions that empower some of the world's most influential industries.

Be part of a team where your work takes centre stage, shaping the future of software development. At JFrog, as a Site Reliability Engineer, you’ll solve critical challenges for leaders like Amazon, Google, and Netflix. Every day brings opportunities to innovate and push boundaries in a fast-moving, frogward-thinking culture. It’s more than writing code; it’s driving the technology that powers the world. If you want your work to matter and thrive on nonstop innovation, JFrog is your place.

We’re hiring a Site Reliability Engineer to help improve the availability, performance, scalability, and operational excellence of our SaaS environments. You’ll work closely with Engineering and Cloud teams to automate operations, strengthen observability, and improve incident response using modern SRE practices (SLOs/SLIs, error budgets, postmortems). This role is hands-on, collaborative, and impact-focused. If you're eager to make a significant impact in a fast-paced, high-growth environment, we encourage you to apply.

As a Site Reliability Engineer at JFrog, you will:

  • Improve reliability, scalability, performance, and observability for JFrog SaaS services in partnership with engineering teams.

  • Implement SRE practices: define SLOs/SLIs, run failure analysis, support capacity planning, perform service readiness reviews, and drive tech-debt reliability improvements.

  • Support day-to-day operations of our multi-cloud, globally distributed, cloud-native, Kubernetes-based SaaS environments to keep services available, performant, cost-efficient, and scalable.

  • Build and enhance internal services and tools to streamline operations and reduce toil through automation.

  • Develop and maintain Python/Go automation to improve deployment safety, incident response and operational visibility.

  • Run PoCs, build prototypes, and drive implementations for agentic automation using an ADK/agent framework, leveraging AI where it meaningfully improves operational and strategic excellence.

  • Support resilience testing/chaos experiments (as appropriate) and improve disaster recovery readiness.

  • Participate in on-call, lead incidents to resolution, and drive postmortems and follow-up actions that prevent recurrence.

  • Act as a primary contact for SaaS production issues, collaborating closely with product engineering groups.

  • Evaluate cloud-native technologies and vendor solutions that improve SaaS reliability and lifecycle management.

To be a Site Reliability Engineer at JFrog, you need:

  • Experience: 4+ years in SRE, DevOps, or Production Engineering in large-scale production environments.

  • Cloud & Orchestration: Production experience with Kubernetes (Docker) and at least one cloud provider (AWS, GCP, or Azure).

  • SRE Fundamentals: Working knowledge of SLO/SLI, alerting strategy, incident response, postmortems, and reliability improvements.

  • Development: Proficiency in Python or Go for automation, integrations, and internal tools.

  • Observability: Hands-on experience with metrics/logs/traces using tools like New Relic, Coralogix, Prometheus, Grafana, OpenTelemetry (or equivalents).

  • Incident & Resilience: Strong incident response and triage using PagerDuty/Opsgenie (or equivalent); exposure to chaos/resilience testing (e.g., Gremlin) and DR readiness.

  • AI/Agentic Ops: Practical use of AI-assisted operations (e.g., log/incident summarization, triage helpers); familiarity with building simple agents using an ADK/agent framework (e.g., LangGraph, LangChain, CrewAI, or similar).

  • CI/CD: Working knowledge of microservices delivery using Jenkins, ArgoCD, or equivalent.

  • Soft Skills: Strong documentation (runbooks, postmortems) and a collaborative, independent problem-solving mindset.

NOTE: We are located in Bangalore (Bellandur) and follow a hybrid work model with a mandatory 3 days in the office.

About JFrog

Public

JFrog provides DevOps and DevSecOps platform solutions for software development and distribution. The company offers tools for artifact management, security scanning, and CI/CD pipeline automation.

Employees: 1,001-5,000

Headquarters: Bozeman

Valuation: $1.5B

Reviews

Overall: 2.6 (9 reviews)

Work Life Balance: 2.3

Compensation: 4.0

Culture: 2.8

Career: 3.2

Management: 2.1

Recommend to a Friend: 35%

Pros

Good compensation and benefits

Supportive team and welcoming environment

Fast-paced and innovative culture

Cons

Poor management and micromanagement

Toxic and fearful work environment

Fast-paced changes and unrealistic expectations

Salary Ranges

89 data points

Junior/L3 · Business Development Representative (BDR) (6 reports)

Total: $81,624 / year

Base: $58,363

Stock: -

Bonus: -

Range: $55,299 to $124,042

Interview Experience

35 interviews

Difficulty: 3.4 / 5

Duration: 14-28 weeks

Offer Rate: 40%

Experience: Positive 62%, Neutral 22%, Negative 16%

Interview Process

1. Phone Screen

2. Technical Interview

3. Hiring Manager

4. Team Fit

Common Questions

Technical skills

Past experience

Team collaboration

Problem solving