Senior Software Engineer, NCCL

NVIDIA

China, Shanghai · On-site · Full-time · 2d ago

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars. NVIDIA is looking for phenomenal people like you to help us accelerate the next wave of artificial intelligence.

We are looking for a highly motivated senior software engineer for an exciting role in our communication libraries and network software team. The position will be part of a fast-paced crew that develops and maintains software for complex heterogeneous computing systems that power disruptive products in High Performance Computing and Deep Learning.

What you will be doing:

  • Design, implement and maintain highly optimized communication runtimes for Deep Learning frameworks (e.g. NCCL for TensorFlow/PyTorch) and HPC programming interfaces (e.g. UCX for MPI/OpenSHMEM) on GPU clusters.

  • Participate in and contribute to parallel programming interface specifications such as MPI and OpenSHMEM.

  • Design, implement and maintain system software that enables interactions among GPUs and interactions between GPUs and other system components.

  • Create proofs of concept to evaluate and motivate extensions to programming models, new runtime designs and new hardware features.

What we need to see:

  • M.S./Ph.D. degree in CS/CE or equivalent experience.

  • 5+ years of relevant experience.

  • Excellent C/C++ programming and debugging skills.

  • Strong experience with Linux.

  • Expert understanding of computer system architecture and operating systems.

  • Experience with parallel programming interfaces and communication runtimes.

  • Ability and flexibility to work and communicate effectively in a multi-national, multi-time-zone corporate environment.

Ways to stand out from the crowd:

  • Deep knowledge of high-performance networks such as InfiniBand and RoCE.

  • Experience with HPC applications. Experience with Deep Learning frameworks such as PyTorch, TensorFlow, JAX/XLA and vLLM/SGLang.

  • Experience with AI/DL communication patterns such as Expert Parallelism (EP), Tensor Parallelism (TP), Data Parallelism (DP) and Pipeline Parallelism (PP), and with how these patterns can be implemented with NCCL. Experience with CUDA kernel optimization and profiling.

  • Experience with large-scale model training and production inference software stacks.

  • Strong collaborative and interpersonal skills, specifically a proven ability to effectively guide and influence within a dynamic matrix environment.

NVIDIA offers highly competitive salaries and a comprehensive benefits package. We have some of the most forward-thinking and talented people in the world working for us and, due to unprecedented growth, our world-class engineering teams are growing fast. If you're a creative and autonomous engineer with real passion for technology, we want to hear from you.

About NVIDIA

NVIDIA · Public

A computing platform company operating at the intersection of graphics, HPC, and AI.

Employees: 10,001+
Headquarters: Santa Clara
Valuation: $4.57T

Reviews

4.1 (10 reviews)

  • Work-life balance: 3.5
  • Compensation: 4.2
  • Company culture: 4.3
  • Career: 4.5
  • Management: 4.0

75% would recommend to a friend

Pros

  • Great culture and supportive environment
  • Smart colleagues and excellent people
  • Cutting-edge technology and learning opportunities

Areas for improvement

  • Team-dependent experience and outcomes
  • Work-life balance issues with long hours
  • Politics and influence over competence

Salary range

73 data points

Junior/L3 · Analyst (7 reports)

Total annual compensation: $170,275
Base salary: $130,981
Stock: -
Bonus: -

$155,480

$234,166

Interview experience

7 interviews

Difficulty: 3.1 / 5

Experience: Positive 0% · Neutral 86% · Negative 14%

Interview process

1. Application Review
2. Recruiter Screen
3. Online Assessment
4. Technical Interview
5. System Design Interview
6. Team Review

Common questions

  • Coding/Algorithm
  • System Design
  • Technical Knowledge
  • Behavioral/STAR