Google

Organizing the world's information and making it universally accessible.

Software Engineer, Data Center Infrastructure Management Lifecycle

Function: Infrastructure
Level: New grad/Junior
Work mode: On-site
Type: Full-time
Posted: 3 months ago

Compensation

$141,000 - $202,000

Benefits

Equity

Flexible work

Parental leave

Health insurance

Learning Budget

Required skills

Node.js

Python

JavaScript

About the job

The DCIM Lifecycle team operates one of the largest-scale monitoring systems at Google, reading telemetry from millions of devices in every Google datacenter. Our challenges include managing the rapid growth and diversification of the Google fleet and hardware, supporting new use cases for critical monitoring of third-party facilities, and retiring technical debt.

Google is bringing back tape libraries to our data centers to support various critical requirements, including a new cold storage tier, better TCO, and contingency for HDD/SSD shortages due to unprecedented AI/ML capacity demand. This role is to design and deliver Tape Health at Google scale for reliability.

In this role, you will work with your teammates to design, code, and put into production very large-scale distributed monitoring systems, and work with your team and partner teams to enable new use cases for large-scale telemetry gathering. You will also create system monitoring dashboards, define service level objectives (SLOs), and write documentation and playbooks. You will have the opportunity to take onsite trips to one or more of Google's datacenters each year to work with new systems and data center technical staff in person.

The US base salary range for this full-time position is $141,000-$202,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.

Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

Responsibilities

  • Design, develop, and maintain software services for collecting and analyzing telemetry data from tape libraries, drives, and robotic components.

  • Implement algorithms and rules to detect, diagnose, and predict hardware failures.

  • Integrate tape health systems with Google's data center health monitoring infrastructure (e.g., system health, network doctor) and automated repair workflows (e.g., surgeon, silk roads).

  • Collaborate with hardware engineers and vendors to understand failure modes and improve diagnostic capabilities.

  • Develop dashboards and tools to provide visibility into the health and status of the tape hardware fleet.

  • Participate in the full software development lifecycle, including requirements gathering, design, coding, testing, deployment, and operation.

Minimum qualifications

  • Bachelor’s degree or equivalent practical experience.

  • 2 years of experience with coding in C++.

  • 1 year of experience with distributed computing.

  • 1 year of experience with debugging, troubleshooting and monitoring systems.

Preferred qualifications

  • Master's degree or PhD in Computer Science, or a related technical field.

  • 2 years of experience in unit testing, integration testing, and continuous deployment.

  • 2 years of experience in SQL.

About Google

Public

Google specializes in internet-related services and products, including search, advertising, and software.

Employees: 10,001+
Headquarters: Mountain View
Valuation: $1,700B

Reviews

Overall rating: 4.5 (10 reviews)

Work-life balance: 3.2
Compensation: 4.3
Culture: 4.1
Career growth: 4.2
Management: 3.8

Recommendation rate: 82%

Pros

Great benefits and perks

Innovative and interesting work

Career development and learning opportunities

Cons

High pressure and expectations

Long hours and heavy workload

Fast-paced and overwhelming environment

Salary range

57,503 data points

Mid/L4 · Accessibility Analyst (1 report)

Total annual compensation: $214,500
Base salary: $165,000
Stock: -
Bonus: -

Interview reviews

9 reviews

Difficulty: 3.4 / 5
Duration: 14-28 weeks
Offer rate: 44%

Experience: 0% positive, 56% neutral, 44% negative

Interview process

1. Application Review
2. Online Assessment/Technical Screen
3. Phone Screen
4. Onsite/Virtual Interviews
5. Team Matching
6. Offer

Common interview questions

Coding/Algorithm

System Design

Behavioral/STAR

Technical Knowledge

Product Sense