
Principal Member of Technical Staff - DevOps (US Citizen Required)
As a Principal Member of Technical Staff (DevOps), you will play a pivotal role in building and operating the next-generation, AI-first Electronic Health Record platform. This role blends strong software engineering fundamentals with Site Reliability Engineering (SRE) and production engineering practices to deliver highly scalable, resilient, secure, and observable cloud-native services.
You will design, develop, and own complex distributed systems end-to-end, from architecture and implementation to production operations, reliability, and continuous improvement. Working closely with technical leads and cross-functional teams, you will ensure services are built using modern engineering principles with a strong focus on availability, scalability, performance, operability, and cost-awareness.
You will embed SRE practices such as SLI/SLO definition, error budgets, observability, incident response, and automated remediation into the development lifecycle. You will proactively improve system reliability through automation, data-driven insights, structured operational workflows, and production engineering excellence (including safe experimentation and resilience testing where appropriate).
You will also leverage AI-assisted development tools to accelerate delivery, improve troubleshooting, and enhance engineering productivity while maintaining rigorous standards for code quality, security, and reliability.
Responsibilities
- Design, build, and operate scalable, secure, and maintainable distributed services in a cloud-native, microservices-based environment.
- Drive architecture and implementation decisions aligned with reliability, performance, and operability requirements.
- Deliver high-quality code with strong CI/CD, automated testing, and release engineering practices.
- Define and operationalize SLIs/SLOs, manage error budgets, and continuously improve service reliability.
- Build and enhance observability across services (metrics, logs, traces), including actionable dashboards and alerting.
- Lead and participate in incident management, on-call/operational readiness, root cause analysis (RCA), and blameless postmortems.
- Build, improve, and standardize operational workflows (runbooks, playbooks, change management, escalation paths, and service readiness reviews).
- Develop and maintain automation for operational excellence: self-healing, automated remediation, drift detection, and reliability guardrails.
- Use automation tools and frameworks to reduce toil and increase consistency across environments.
- Apply AI tools to support coding, debugging, alert/incident triage, and operational insights (AIOps-aligned workflows where appropriate).
Minimum Qualifications
- BS/MS in Computer Science (or equivalent practical experience).
- Must be a U.S. citizen with the ability to obtain and maintain a Federal Security Clearance.
- At least 7 years of relevant software engineering experience.
- Proficient in at least one (preferably two) of: Java, C/C++, Golang.
- Hands-on experience in SRE or similar roles (DevOps / Production Engineering).
- Proven, hands-on experience with automation tools and frameworks (e.g., infrastructure/app automation, CI/CD automation, operational runbook automation).
- Strong scripting skills (e.g., Python, Bash, or similar).
- Demonstrated experience building or improving operational workflows in production environments.
- Strong understanding of reliability engineering, monitoring/observability, and incident management (including RCA and postmortems).
AI-Assisted Engineering
- Demonstrated experience using AI-assisted development tools/IDEs (e.g., Codex, Claude, Cline, or similar) and integrating them into development workflows to improve productivity and reduce turnaround time.
- Experience using ChatGPT, Claude, or similar models to support development and operational tasks (e.g., code generation, debugging, documentation, triage).
Preferred Qualifications
- Experience with containers, Kubernetes, and operating reliable services at scale.
- Familiarity with MCP tools/servers and multi-tool orchestration / skills-based frameworks.
- Familiarity with “AI-accelerated” development approaches (rapid prototyping plus disciplined engineering, testing, and operational readiness).
- Strong CS fundamentals: data structures, algorithms, operating systems, networking, and distributed systems.
- Excellent communication and collaboration skills; comfortable working across teams and communicating technical topics to senior stakeholders.
- Experience contributing to intelligent automation and AIOps-driven workflows.
Disclaimer:
Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates. Range and benefit information provided in this posting is specific to the stated locations only.
US: Hiring Range in USD from: $96,800 to $223,400 per annum. May be eligible for bonus and equity.
Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle's differing products, industries and lines of business.
Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.
Oracle US offers a comprehensive benefits package which includes the following:
1. Medical, dental, and vision insurance, including expert medical opinion
2. Short-term disability and long-term disability
3. Life insurance and AD&D
4. Supplemental life insurance (Employee/Spouse/Child)
5. Health care and dependent care Flexible Spending Accounts
6. Pre-tax commuter and parking benefits
7. 401(k) Savings and Investment Plan with company match
8. Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
9. 11 paid holidays
10. Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
11. Paid parental leave
12. Adoption assistance
13. Employee Stock Purchase Plan
14. Financial planning and group legal
15. Voluntary benefits including auto, homeowner and pet insurance
The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.
Career Level: IC4
As a member of the software engineering division, you will take an active role in defining and evolving standard practices and procedures. You will be responsible for defining and developing software for tasks associated with developing, designing, and debugging software applications or operating systems.