Hiring
NVIDIA is seeking dynamic Solution Architects with specialized expertise in training Large Language Models (LLMs), implementing RAG workflows, and orchestrating agentic inference. You will leverage the full NVIDIA software and hardware ecosystem to design, optimize, and deliver production-grade generative AI solutions for enterprise customers. With competitive salaries and a generous benefits package, we are widely considered to be one of the world’s most desirable employers! We have some of the most forward-thinking and hardworking people in the world working for us and, due to outstanding growth, our best-in-class engineering teams are expanding rapidly. If you're a creative and autonomous person with a real passion for technology, we want to hear from you.
What You Will Be Doing:
- Architect end-to-end solutions focused on LLM pretraining, fine-tuning, high-performance inference, RAG workflows, and agentic inference orchestration using NVIDIA’s hardware and software platforms.
- Collaborate with customers to understand their LLM-related business challenges and design tailored solutions aligned with the NVIDIA ecosystem.
- Lead LLM training, distributed optimization, and performance tuning to achieve optimal throughput, latency, and memory efficiency.
- Design and integrate RAG workflows and agentic inference pipelines into customer systems; provide technical guidance on best practices.
- Collaborate with NVIDIA engineering teams to provide feedback and support pre-sales technical activities (workshops, demos).
What We Need to See:
- Master’s or Ph.D. in Computer Science, Artificial Intelligence, or equivalent experience.
- 4+ years of hands-on experience in AI, focusing on open-source LLM training, fine-tuning, and production inference optimization.
- Deep understanding of mainstream LLM architectures and proficiency in LLM customization via PyTorch and Hugging Face Transformers.
- Solid knowledge of GPU computing, cluster architecture, and distributed parallel training/inference for LLMs.
- Competency in agentic inference design and in using AI agents to solve business challenges.
- Strong communication skills, with the ability to articulate complex technical concepts to technical and non-technical stakeholders.
Ways to Stand Out from the Crowd:
- Hands-on experience with NVIDIA’s generative AI ecosystem (TRT-LLM, Megatron-LM, NVIDIA NeMo).
- Advanced skills in LLM optimization (quantization, KV cache tuning, memory footprint reduction).
- Experience with Docker and Kubernetes for containerized LLM and agent workflow deployment on-prem.
- In-depth knowledge of multi-GPU parallelism and large-scale GPU cluster management.
#deeplearning
About NVIDIA

NVIDIA (Public)
A computing platform company operating at the intersection of graphics, HPC, and AI.
Employees: 10,001+
Headquarters: Santa Clara
Valuation: $4.57T
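Reviews
Overall rating: 4.1 (10 reviews)
Work-life balance: 3.5
Compensation: 4.2
Company culture: 4.3
Career growth: 4.5
Management: 4.0
Would recommend to a friend: 75%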
Pros
Great culture and supportive environment
Smart colleagues and excellent people
Cutting-edge technology and learning opportunities
Cons
Team-dependent experience and outcomes
Work-life balance issues with long hours
Politics and influence over competence
Salary Range
73 data points
L3 · Data Scientist IC2 (0 reports)
Total annual compensation: $177,542
Base salary: n/a
Stock: n/a
Bonus: n/a
Range: $150,910 to $204,174
Interview Experience
7 interviews
Difficulty: 3.1 / 5
Experience: positive 0%, neutral 86%, negative 14%
Interview Process
1. Application Review
2. Recruiter Screen
3. Online Assessment
4. Technical Interview
5. System Design Interview
6. Team Review
Common Questions
Coding/Algorithm
System Design
Technical Knowledge
Behavioral/STAR
News
Negotiating NVIDIA's Offer
Base, stock, and sign-on negotiable. Recruiters invested in closing candidates. CEO reviews all 42K employee salaries monthly. Stock growth has made many employees millionaires.
NVIDIA Company Reviews
WLB rated 3.9/5 (lowest category). 64% satisfied with WLB but 53% feel burnt out. Compensation rated 4.4-4.5/5. Experience highly team-dependent.
NVIDIA Interview Discussions
Technical bar is high with 4-6 rounds. Process takes 4-8 weeks. Expect C++ questions, LeetCode medium, and system design. Difficulty rated 3.16/5.
NVIDIA Culture Discussions
Team-dependent experience; sink-or-swim culture that rewards high performers but can be overwhelming. No politics, flat structure, but demanding workload with some teams requiring evening/weekend work.