Staff Reliability Engineer

Hartford

4 Locations

On-site

Full-time

1w ago

Compensation

$127,600 - $191,400

Benefits & Perks

•Healthcare

•401(k)

•Flexible Hours

Required Skills

Python

Infrastructure-as-code

CI/CD

Cloud platforms

Monitoring and alerting

DataOps

AIOps

Staff Reliability Engineer
IE07KE

We’re determined to make a difference and are proud to be an insurance company that goes well beyond coverages and policies. Working here means having every opportunity to achieve your goals – and to help others accomplish theirs, too. Join our team as we help shape the future.

The Hartford is seeking a highly skilled Senior Reliability Engineer (RE) to join our Enterprise Data Organization. This role is pivotal in applying software engineering principles to operations, ensuring the reliability, performance, and scalability of the organization's foundational data infrastructure, platforms, and applications. You will be instrumental in driving our transition from traditional production support to a modern RE model through automation, toil reduction, and standardized service management.

This role can have a Hybrid or Remote work arrangement. Candidates who live near one of our locations are expected to work in an office 3 days a week (Tuesday through Thursday). Candidates who do not live near an office should maintain their current work arrangement, with the expectation of coming into the office as business needs arise.

Responsibilities:

Platform Reliability & Resiliency: Design, build, and maintain highly reliable, scalable, and resilient cloud-based data platforms on AWS and GCP, including core infrastructure and services such as Snowflake, EKS, OpenSearch, EMR, and Hadoop ecosystems.
Automation & Toil Reduction: Champion the RE mandate by identifying manual, repetitive operational tasks (toil) and developing robust automation solutions to eliminate them. This includes automating provisioning, deployment, self-healing and operational tasks.
Observability & Monitoring: Implement and manage comprehensive observability solutions (monitoring, alerting, logging, tracing) for the underlying data infrastructure and applications, focusing on establishing clear Service Level Indicators (SLIs) and Service Level Objectives (SLOs).
Incident Response & Management: Act as an escalation point for production incidents, leading incident response, performing deep root cause analysis (RCA), designing error budgets, and implementing preventative measures to ensure issues do not recur.
Standardization & Documentation: Lead the standardization of operational processes and documentation, including the creation and automation of dynamic runbooks and playbooks for consistent and efficient incident resolution and service management.
RE Transition: Serve as the RE Subject Matter Expert and collaborate with other Platform, Product, and Data Engineering Support teams to instill RE best practices, including participation in system design consulting, capacity planning, and deployment pipelines (CI/CD).

Qualifications:

10+ years of overall experience in an Infrastructure, Data, or related technology organization with increasing responsibilities as a hands-on technologist.
Must have 5+ years of experience as an RE, Cloud, or DevOps Engineer, or in a similar role supporting large-scale enterprise infrastructure and applications.
Strong scripting and programming skills (Python etc.) for automation and tooling development.
Experience with infrastructure-as-code (e.g., Terraform, CloudFormation, Ansible) and CI/CD tools.
Experience designing and operating reliable and resilient infrastructure, fail-safe patterns, reliability controls, and observability from a Reliability Engineering (SRE/RE) infrastructure support perspective across cloud and big data platforms (AWS, GCP, Amazon EMR, Hadoop/Spark, OpenSearch, container orchestration platforms, etc.).
Familiarity with cloud-native integrations with databases, data integration, and business intelligence platforms (Snowflake, Informatica IDMC, Tableau, ThoughtSpot, etc.).
Expertise in setting up and tuning monitoring and alerting systems (e.g., Dynatrace, Splunk, Prometheus, Grafana, Datadog, OpenTelemetry).
Expertise in defining and implementing DataOps practices.
Expertise implementing AIOps to monitor, manage, and self-heal infrastructure and data platforms, including experience applying machine learning principles for anomaly detection, alerting, and runbook automation.
Experience with prompt engineering, implementing AWS or Google AI services, and AI-enabled automation for infrastructure reliability and performance management.
Relevant industry certifications preferred (AWS, GCP, Kubernetes, SRE/DevOps frameworks, etc.).

This role will have a Hybrid work schedule, with the expectation of working in an office (Columbus, OH; Chicago, IL; Hartford, CT; or Charlotte, NC) 3 days a week (Tuesday through Thursday).

Candidates must be authorized to work in the US without company sponsorship. The company will not support the STEM OPT I-983 Training Plan endorsement for this position.

Compensation

The listed annualized base pay range is primarily based on analysis of similar positions in the external market. Actual base pay could vary and may be above or below the listed range based on factors including but not limited to performance, proficiency and demonstration of competencies required for the role. The base pay is just one component of The Hartford’s total compensation package for employees. Other rewards may include short-term or annual bonuses, long-term incentives, and on-the-spot recognition. The annualized base pay range for this role is:

$127,600 - $191,400

Equal Opportunity Employer/Sex/Race/Color/Veterans/Disability/Sexual Orientation/Gender Identity or Expression/Religion/Age

Reviews

3.8 (9 reviews)

Work Life Balance: 2.8

Compensation: 2.5

Culture: 3.9

Career: 3.2

Management: 3.1

67% Recommend to a Friend

Pros

Great training programs

Good company culture and work environment

Supportive and accessible management

Cons

Management issues and instability

Low compensation and merit increases

High call volume and long hours

Salary Ranges

29 data points

Mid/L4

Senior/L5

Mid/L4 · BUSINESS INTELLIGENCE DEVELOPER

1 reports

$107,484

total / year

Base

$82,680

Stock

Bonus

$107,484

Interview Experience

3 interviews

Difficulty: 3.3 / 5

Duration: 14-28 weeks

Experience: Positive 0%, Neutral 67%, Negative 33%

Interview Process

Phone Interview

Video Interview

Analyst Interview

Trader Interview

Vice President Interview
