
Investment services and wealth management firm
Sr Manager, AI Site Reliability Engineer at Charles Schwab
About the role
Your opportunity
At Charles Schwab, our purpose is simple: we champion every client's goals with passion and integrity. Guided by honesty, mutual respect and a commitment to doing what's right, we bring innovation, education, and service together to help shape financial futures. Our people are the foundation of our success - they approach their work with curiosity and collaboration, coming together to create solutions that make a meaningful impact for clients and communities. As we expand into India, we are bringing this same culture of inclusion, learning, and opportunity to new talent. Joining us means becoming part of a global team where your work matters and your future can take shape.
Our Hyderabad location is central to Schwab's growth, bringing together talented people and technology to drive innovation, scale and efficiency. Here, you will work alongside teams who create solutions that support millions of clients every day. The work you do is more than daily operations - it's a chance to experiment, learn, and build within a values-driven, supportive environment. This is a unique opportunity to be part of our early growth phase and shape something new, backed by the stability and strength of a Fortune 500 company. Your impact begins on day one, and your contributions will help define our future in the region.
We are seeking an AI Site Reliability Engineer to join a forward-thinking engineering team that builds intelligent observability, monitoring, and deployment automation solutions using AI-augmented development practices. This is not traditional production support. You will build software solutions for operational challenges, leveraging Gen AI to accelerate workflows, automate incident response, and drive systems toward five nines (99.999%) availability while ensuring capacity planning, redundancy, and the highest standards of reliability, security, and scalability. This is a hands-on engineering role where you will actively contribute to architecture, code, and AI-driven tooling. It is ideal for engineers who are curious, adaptable, and excited about working at the intersection of software engineering and AI.
Key Responsibilities
High Availability & Resilience:
- Design and implement architectures that achieve and sustain 99.999% uptime across critical systems
- Define, measure, and track SLOs, SLIs, and error budgets
- Build AI-powered self-healing systems with automated failover, redundancy, and graceful degradation
- Perform AI-assisted capacity planning, demand forecasting, and load testing
- Conduct chaos engineering practices enhanced with AI-driven failure prediction
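To put the 99.999% target and the error-budget language above into numbers, here is a minimal sketch. The function names, the request-based budget definition, and the sample figures are illustrative assumptions, not part of the posting. Over a 365-day year, a 99.999% availability SLO leaves roughly 5.26 minutes of allowed downtime.

```python
# Illustrative sketch only: names and sample numbers are assumptions.

SECONDS_PER_DAY = 24 * 60 * 60


def allowed_downtime_seconds(slo: float, window_days: float) -> float:
    """Downtime permitted by an availability SLO over a window of days."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    return (1.0 - slo) * window_days * SECONDS_PER_DAY


def error_budget_remaining(slo: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of a request-based error budget still unspent.

    1.0 means nothing has been spent; a negative value means the budget is overspent.
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = (1.0 - slo) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    for label, slo in [("99.99%", 0.9999), ("99.999%", 0.99999)]:
        yearly = allowed_downtime_seconds(slo, 365)
        monthly = allowed_downtime_seconds(slo, 30)
        print(f"{label}: {yearly / 60:.2f} min per year, {monthly:.2f} s per 30 days")
    # Hypothetical month: 1,000,000 requests, 250 failures, 99.9% request SLO.
    print(f"budget remaining: {error_budget_remaining(0.999, 1_000_000, 250):.0%}")
```

The gap between 99.99% (the required operating experience) and 99.999% (the target) is a factor of ten in allowed downtime, which is why automated failover and detection matter more than manual response at this level.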
Observability & Monitoring:
- Design, build, and maintain AI-enhanced observability platforms covering metrics, logs, traces, and intelligent alerting
- Implement AI-powered anomaly detection, predictive alerting, and proactive system health management
- Leverage Gen AI to auto-generate and refine dashboards, alert rules, and runbooks
- Build real-time availability dashboards with AI-driven trend analysis tracking 99.999% targets
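As a rough illustration of the anomaly detection mentioned above, the sketch below uses a rolling z-score. This is a plain statistical baseline standing in for the AI-powered detection the role describes, not a description of Schwab's tooling; the class name, window size, and threshold are assumptions.

```python
# Illustrative sketch only: a rolling z-score baseline, not an AI model.
from collections import deque
from statistics import mean, stdev


class RollingZScoreDetector:
    """Flags a metric sample that deviates strongly from the recent window."""

    def __init__(self, window: int = 30, threshold: float = 3.0) -> None:
        if window < 2:
            raise ValueError("window must hold at least 2 samples")
        self._values: deque = deque(maxlen=window)
        self._threshold = threshold

    def observe(self, value: float) -> bool:
        """Return True if value is anomalous against prior samples, then record it."""
        anomalous = False
        if len(self._values) >= 2:  # stdev needs at least two samples
            mu = mean(self._values)
            sigma = stdev(self._values)
            # A perfectly flat window (sigma == 0) is skipped rather than
            # dividing by zero; a real system would need a policy for that case.
            if sigma > 0:
                anomalous = abs(value - mu) / sigma > self._threshold
        self._values.append(value)
        return anomalous


if __name__ == "__main__":
    detector = RollingZScoreDetector(window=30, threshold=3.0)
    # Hypothetical latency samples in milliseconds, ending with a spike.
    for latency_ms in [100, 102, 98, 101, 99, 100, 103, 97, 250]:
        if detector.observe(latency_ms):
            print(f"anomaly: {latency_ms} ms")
```

Because the spike is appended to the window after being flagged, it inflates the baseline for later samples; excluding flagged points is one common refinement.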
Root Cause Analysis:
- Build AI-accelerated root cause analysis with thorough postmortems and actionable remediation
- Build AI-powered diagnostic tools that automatically correlate logs, metrics, and traces
- Use Gen AI to analyze incident patterns, predict recurring failures, and recommend preventive actions
- Build AI agents that automate initial triage and preliminary RCA for common incident types
- Continuously reduce MTTD and MTTR through AI-assisted workflows
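MTTD and MTTR, named above, can be computed from incident timestamps. The sketch below assumes MTTD is measured from incident start to detection and MTTR from incident start to resolution; definitions vary between teams, and the `Incident` record and sample times are hypothetical.

```python
# Illustrative sketch only: the Incident fields and the MTTR definition
# (start to resolution) are assumptions.
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class Incident:
    started: datetime
    detected: datetime
    resolved: datetime


def mttd(incidents: Sequence[Incident]) -> timedelta:
    """Mean time to detect: average of (detected - started)."""
    return timedelta(seconds=mean(
        (i.detected - i.started).total_seconds() for i in incidents))


def mttr(incidents: Sequence[Incident]) -> timedelta:
    """Mean time to resolve: average of (resolved - started)."""
    return timedelta(seconds=mean(
        (i.resolved - i.started).total_seconds() for i in incidents))


if __name__ == "__main__":
    day = datetime(2025, 1, 15)
    history = [
        Incident(day.replace(hour=10), day.replace(hour=10, minute=4),
                 day.replace(hour=10, minute=34)),
        Incident(day.replace(hour=12), day.replace(hour=12, minute=2),
                 day.replace(hour=12, minute=12)),
    ]
    print("MTTD:", mttd(history))
    print("MTTR:", mttr(history))
```

Tracking both values per incident type makes it easier to see whether automated triage is shortening detection, resolution, or neither.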
Deployment Automation:
- Design and maintain AI-enhanced CI/CD pipelines with AI-driven deployment risk scoring
- Implement progressive delivery with canary releases, blue-green deployments, and automated rollbacks triggered by AI anomaly detection
- Leverage Gen AI to generate deployment scripts, IaC templates, and pipeline configurations
- Eliminate manual toil by building AI agents for repetitive operational tasks
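The automated-rollback idea above can be sketched as a simple gate that compares a canary's error rate with the baseline's. A fixed ratio threshold stands in here for the AI anomaly detection the posting describes; the names, thresholds, and traffic numbers are all assumptions.

```python
# Illustrative sketch only: a fixed-threshold canary gate, not an AI model.
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficWindow:
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def should_roll_back(
    baseline: TrafficWindow,
    canary: TrafficWindow,
    max_ratio: float = 2.0,      # canary may err at most 2x the baseline rate
    min_requests: int = 500,     # below this, there is too little data to judge
    rate_floor: float = 0.001,   # tolerated rate even if the baseline is near zero
) -> bool:
    """Return True when the canary's error rate exceeds the allowed limit."""
    if canary.requests < min_requests:
        return False
    allowed = max(baseline.error_rate * max_ratio, rate_floor)
    return canary.error_rate > allowed


if __name__ == "__main__":
    baseline = TrafficWindow(requests=100_000, errors=50)
    canary = TrafficWindow(requests=1_000, errors=5)
    action = "roll back" if should_roll_back(baseline, canary) else "promote"
    print(action)
```

Returning False on low traffic keeps the gate from reacting to noise, but it also means a pipeline using it should wait for enough requests before promoting.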
AI-Driven Development:
- Use GenAI tools (GitHub Copilot, Cursor, ChatGPT) for coding, debugging, documentation, and operations
- Developer ownership is non-negotiable. All code, whether human or AI-generated, must be reviewed, tested, and understood before merging
- Design and develop AI agents for incident response, log analysis, capacity management, and operational automation
- Define agent goals, tool use, memory, and orchestration logic for multi-step SRE workflows
- Apply spec-driven development and continuously evaluate emerging Gen AI tools
Testing & Quality:
- Leverage Gen AI to generate tests, identify coverage gaps, and create edge-case scenarios
- Drive AI-powered security scanning, performance testing, and reliability validation early in development
Modernization & Collaboration:
- Modernize existing monitoring, alerting, and deployment systems using AI-assisted workflows
- Identify technical debt and propose AI-accelerated remediation strategies with measurable outcomes
- Participate in architecture discussions, design reviews, and code reviews
- Develop and maintain prompt engineering guidelines and custom instructions for consistent AI-assisted development
- Share learnings, patterns, and tooling insights with the broader team
What Success Looks Like:
- Drives and sustains 99.999% availability through AI-enhanced reliability engineering
- Delivers AI-powered observability that predicts issues before customer impact
- Continuously reduces MTTD and MTTR through AI-accelerated RCA and automated diagnostics
- Builds fully automated deployment pipelines with zero-downtime releases and minimal toil
- Takes full ownership of all code, including AI-generated output, with rigorous review and validation
- Actively contributes to improving the team's AI-enabled engineering practices
- Grows prompt engineering, agent development, and Gen AI skills alongside core SRE fundamentals
What you have
Required Qualifications:
- 8+ years of software engineering experience with a strong focus on SRE, DevOps, or platform engineering
- Bachelor's degree in Computer Science or equivalent
- Hands-on technical ability to architect, review code, and contribute to critical decisions
- Proven experience operating systems at 99.99% or higher availability
- Hands-on experience with GenAI coding tools and demonstrated ability to apply them effectively
- Proficiency in Java or .NET, with scripting skills in Python, Bash, or PowerShell
- Experience building AI agents for operational automation
- Strong prompt engineering skills with ability to contribute to team-level AI practices
- Deep understanding of HA patterns including active-active, multi-region failover, and distributed consensus
- Hands-on experience with observability platforms (Datadog, Splunk, Grafana, Prometheus, ELK, or similar)
- Strong experience with CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions, ArgoCD, or similar)
- Proven track record leading RCA efforts and driving reliability improvements
- Strong CS fundamentals including system design, networking, concurrency, and algorithms
Preferred Qualifications:
- Experience with cloud platforms (GCP, AWS, or Azure) and IaC (Terraform, Pulumi, or similar)
- Experience with containerization and orchestration (Docker, Kubernetes)
- Knowledge of chaos engineering and ML-powered anomaly detection
- Experience with legacy system modernization
What's in it for you
At Schwab India, you're empowered to shape your future. We support your growth through meaningful work, continuous learning, and a culture rooted in trust and collaboration - so you can build the skills to make a lasting impact. Our benefits are designed to care for your wellbeing, your family, and your long-term financial security.
Our base benefits, wellbeing, and total rewards include:
- Competitive compensation and retirement programs including Employee Provident Fund (EPF), Gratuity, and optional National Pension System (NPS) contributions
- Robust Paid Time Off, including annual/privilege leave, sick and casual leave, public holidays, maternity/paternity leave, and more
- Education assistance for continued learning to help you grow
- Comprehensive medical insurance with Outpatient Department (OPD) services, including vaccination, pharmacy, dental, and vision coverage
- Annual reimbursement for health check-ups and mental health support through our Employee Assistance Program (EAP)
- Childcare (creche) reimbursement for eligible employees
- Transportation and meal benefits that support your day-to-day work
- Group life, personal accident, and critical illness insurance
Required skills
Site reliability engineering
Observability
Monitoring
Automation
Deployment engineering
AI engineering
Software engineering
Problem solving
About Charles Schwab

Charles Schwab (Public)
Charles Schwab Corporation is a major American multinational financial services company that provides brokerage, banking, and financial advisory services to individual and institutional clients.
Employees: 10,001+
Headquarters: Westlake
Valuation: $134B
Reviews
Overall rating: 4.2 (10 reviews)
Work-life balance: 3.8
Compensation: 4.2
Culture: 4.5
Career: 3.2
Management: 4.0
Recommend to a friend: 75%
Pros
Supportive and approachable management
Great work-life balance and flexibility
Excellent benefits and competitive pay
Cons
High pressure and demanding workload
Limited career advancement opportunities
Fast-paced environment causing stress
Salary Ranges
30 data points
L2 · Cybersecurity Analyst L2 (0 reports)
Total per year: $97,500
Base: $39,000
Stock: $48,750
Bonus: $9,750
Range: $68,250 to $126,750
Interview experience
7 interviews
Difficulty: 3.0 / 5
Duration: 14-28 weeks
Offer rate: 28%
Experience: Positive 14%, Neutral 58%, Negative 28%
Interview process
1. Phone Screen
2. Interview
3. Background Check
Common questions
- Phone Interview
- Recruiter Screening
- Technical Assessment
Latest updates
- Erste Group Bank Predicts Charles Schwab FY2026 Earnings - MarketBeat (News, 1w ago)
- Schwab finds most teens want to invest, but split with parents on control - InvestmentNews (News, 1w ago)
- Early Start, Long-Term Mindset: Teens Increasingly Interested in Investing - Investing News Network (News, 1w ago)
- Major Insider Move at Charles Schwab Shakes Up Investor Watchlists - TipRanks (News, 1w ago)