Data Center Operations Engineer

Cadence

Santa Fe, New Mexico · On-site · Full-time · Posted 2 months ago

Benefits

Healthcare

Equity

Required Skills

SAP

Python

Excel

At Cadence, we hire and develop leaders and innovators who want to make an impact on the world of technology.

Job Summary:

The Data Center Operations Engineer is responsible for supporting, maintaining, and deploying critical data center infrastructure with a strong focus on Linux-based systems, GPU server deployments, and InfiniBand networking. This role requires hands-on expertise in data center operations, cluster bring-up, hardware installation, and troubleshooting across compute, network, and GPU environments. The engineer will collaborate closely with global infrastructure, development, and operations teams to ensure reliable, secure, and scalable service delivery.

Key Responsibilities:

  • Provide hands-on operational support for all data center projects, deployments, and repair activities.

  • Participate in an on-call rotation and provide on-site or remote support during maintenance windows and incidents.

  • Troubleshoot and resolve operational issues related to Linux servers, GPU platforms, networking, and storage infrastructure.

  • Support customer and internal deployments, ensuring timely and successful bring-up of GPU servers and clusters.

  • Perform InfiniBand fabric bring-up, switch configuration, subnet management, and troubleshooting.

  • Conduct daily health checks of Linux systems and infrastructure components, proactively identifying and mitigating risks.

  • Install, configure, test, and maintain server hardware (rack and stack, labeling, HDDs, memory, CPUs, RAID batteries, NICs, etc.).

  • Install, configure, and troubleshoot networking equipment including routers, switches, and terminal servers for out-of-band management.

  • Review and validate equipment deployments against approved design documentation and standards.

  • Support data center builds, refreshes, migrations, and expansions while adhering to quality and safety standards.

  • Coordinate with vendors and onsite staff for hardware delivery, diagnostics, replacement, and warranty services.

  • Utilize monitoring and alerting frameworks to identify issues, escalate appropriately, and ensure timely service restoration.

  • Maintain accurate documentation of operational procedures, system configurations, and runbooks.

  • Follow established incident management, escalation procedures, and service-level agreements (SLAs).

  • Collaborate with global teams across time zones to support operational initiatives and continuous improvement efforts.

  • Contribute to process improvement initiatives and ensure adherence to documented policies, processes, and procedures.

Required Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, Information Technology, or equivalent practical experience.

  • Strong hands-on experience in Linux environments, including system administration, troubleshooting, and performance validation.

  • Proficiency with Linux command-line tools and shell scripting (Bash or equivalent).

  • Experience with cluster bring-up, driver installation, and system-level configuration.

  • Hands-on experience setting up and validating GPU servers in clustered environments.

  • Experience with end-to-end GPU testing in InfiniBand-based clusters.

  • Working knowledge of InfiniBand networking, including switch configuration and subnet management.

  • Solid understanding of networking fundamentals, including the OSI model and TCP/IP protocol suite (IP, ARP, ICMP, TCP, UDP, SMTP, FTP, TFTP).

  • Experience installing, configuring, and troubleshooting routers, switches, and terminal servers.

  • Familiarity with fiber and copper cabling, including IP and SAN deployments.

  • Experience managing incident tickets, maintaining acceptable ticket loads, and meeting SLAs.

  • Strong organizational skills with meticulous attention to detail in data center environments.

  • Ability to follow and enforce documented escalation procedures and operational policies.

  • Strong verbal and written communication skills, with the ability to collaborate effectively with cross-functional and global teams.

Preferred Qualifications:

  • Experience supporting HPC, AI, or large-scale GPU environments.

  • Exposure to data center monitoring

  • Experience documenting operational processes and maintaining technical runbooks.

  • Familiarity with large-scale data center buildouts or refresh programs.

Physical Requirements:

  • Ability to perform the essential functions of the role, including lifting, moving, and installing equipment weighing 50 pounds or more, with or without reasonable accommodation.

  • Ability to work in data center environments, including raised floors, equipment racks, and confined spaces.

  • Willingness to work flexible hours, including nights, weekends, and on-call rotations as required.

Work Environment:

  • On-site data center environment with occasional remote coordination.

  • Interaction with hardware vendors, service providers, and internal engineering teams.

  • Fast-paced operational setting requiring attention to detail, adherence to safety standards, and rapid problem resolution.

We’re doing work that matters. Help us solve what others can’t.

About Cadence

Cadence

Public

Cadence Design Systems provides electronic design automation (EDA) software, hardware, and IP for designing and verifying electronic systems and semiconductors.

Employees: 5,001-10,000

Headquarters: San Jose

Valuation: $8.5B

Reviews

Overall rating: 4.0 (10 reviews)

Work-life balance: 4.2

Compensation: 2.8

Culture: 4.1

Career: 3.2

Management: 3.4

72% would recommend to a friend

Pros

Good work-life balance

Supportive and collaborative team environment

Flexible work arrangements

Cons

Below market compensation

Limited career advancement opportunities

Heavy workload and long hours

Salary Information

58 data points

Junior/L3 · Data Analyst

1 report

Total compensation: $91,103

Base salary: $85,276

Stock: -

Bonus: $5,827

$59,612

$139,984

Interview Experience

1 interview

Difficulty: 3.0 / 5

Duration: 14-28 weeks

Interview Process

1. Application Review

2. Recruiter Screen

3. Technical Phone Screen

4. Onsite/Virtual Interviews

5. Final Decision

Common Questions

Technical Knowledge

Behavioral/STAR

Past Experience

Problem Solving