Hiring
NVIDIA is seeking a motivated AI Acceleration & Optimization Engineer to join our Acceleration Computing, Optimization and Tools (ACOT) team. In this role, you will help improve the performance, scalability, and efficiency of modern AI models across NVIDIA GPU platforms. You will work with engineers across algorithms, systems, and hardware to support high-performance model deployment and development for real-world AI workloads.
As part of ACOT, you will collaborate with architecture, research, CUDA, compiler, and framework teams to help bring next-generation AI workloads from research to production with strong performance and reliability.
What you will be doing
- Assist in optimizing AI models such as LLMs, VLMs, diffusion models, and multimodal models for inference and training on NVIDIA GPUs.
- Profile workloads and help identify performance bottlenecks across GPU compute, memory, networking, and storage.
- Support the development and integration of optimization techniques such as quantization, kernel fusion, parallelism, and memory efficiency improvements.
- Use tools including CUDA, TensorRT, Nsight, and NVIDIA acceleration libraries to analyze and improve model performance.
- Work with deep learning frameworks including PyTorch, JAX, and TensorFlow, as well as open-source inference frameworks like vLLM and SGLang.
- Contribute to performance benchmarking, testing, and internal tooling to improve optimization workflows.
- Partner with senior engineers and multi-functional teams to evaluate workload behavior and support future performance improvements.
What we want to see
- Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, Computer Engineering, or a related field (or equivalent experience).
- 2–4 years of experience, or strong academic/project experience, in deep learning, performance engineering, systems, or high-performance computing.
- Good understanding of deep learning fundamentals and modern AI model architectures, especially transformers.
- Familiarity with GPU architecture and parallel computing concepts such as CUDA, kernels, memory hierarchy, and streams.
- Exposure to profiling and performance analysis tools.
- Programming skills in Python.
- Experience with at least one major ML framework such as PyTorch, TensorFlow, or JAX.
Ways to stand out from the crowd
- Internship, research, or project experience optimizing AI/ML workloads on GPUs.
- Hands-on experience with TensorRT, TensorRT-LLM, vLLM, SGLang, or similar inference/runtime frameworks.
- Familiarity with quantization, sparsity, or mixed-precision techniques.
- Experience with distributed training or inference concepts.
- Contributions to open-source ML systems, performance tools, or infrastructure projects.
- Proficiency in C++, strong debugging skills, and an interest in low-level performance optimization.
About NVIDIA
NVIDIA (Public)
A computing platform company operating at the intersection of graphics, HPC, and AI.
- Employees: 10,001+
- Headquarters: Santa Clara
- Valuation: $4.57T
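Reviews
Overall rating: 4.1 (10 reviews)
- Work-life balance: 3.5
- Compensation: 4.2
- Culture: 4.3
- Career growth: 4.5
- Management: 4.0
- Would recommend to a friend: 75%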
Pros
- Great culture and supportive environment
- Smart colleagues and excellent people
- Cutting-edge technology and learning opportunities
Cons
- Team-dependent experience and outcomes
- Work-life balance issues with long hours
- Politics and influence over competence
Salary range
73 data points (levels: L3, L4, L5)
L3 · Data Scientist IC2 (0 reports)
- Total annual compensation: $177,542 (range: $150,910 to $204,174)
- Base salary: -
- Stock: -
- Bonus: -
Interview experience
7 interviews
- Difficulty: 3.1 / 5
- Experience: 0% positive, 86% neutral, 14% negative
Interview process
1. Application Review
2. Recruiter Screen
3. Online Assessment
4. Technical Interview
5. System Design Interview
6. Team Review
Common questions
- Coding/Algorithm
- System Design
- Technical Knowledge
- Behavioral/STAR
News
Negotiating NVIDIA's Offer
Base, stock, and sign-on negotiable. Recruiters invested in closing candidates. CEO reviews all 42K employee salaries monthly. Stock growth has made many employees millionaires.
NVIDIA Company Reviews
WLB rated 3.9/5 (lowest category). 64% satisfied with WLB but 53% feel burnt out. Compensation rated 4.4-4.5/5. Experience highly team-dependent.
NVIDIA Interview Discussions
Technical bar is high with 4-6 rounds. Process takes 4-8 weeks. Expect C++ questions, LeetCode medium, and system design. Difficulty rated 3.16/5.
NVIDIA Culture Discussions
Team-dependent experience; sink-or-swim culture that rewards high performers but can be overwhelming. No politics, flat structure, but demanding workload with some teams requiring evening/weekend work.