AI Computing Performance Architect Intern, Perf Analysis and Kernel Dev - 2026

NVIDIA

China, Shanghai · On-site · Full-time · 1mo ago

Benefits & Perks

Top Tier compensation with equity

Annual team offsites

Remote work flexibility

Health, dental, and vision coverage

Flexible PTO policy

Wellness benefits

Required Skills

SQL

Airflow

Apache Spark

NVIDIA is developing processor and system architectures that accelerate machine learning, automotive, and high-performance computing applications. We are seeking a strong candidate to do performance analysis and kernel development for NVIDIA's new architectures. Your work will play a critical role in shaping the future of deep learning hardware and software, ensuring optimal performance for next-generation AI applications. This position offers the opportunity to make a meaningful impact in a fast-moving, technology-focused company.

What you'll be doing:

  • Design, develop, and optimize major LLM layers (e.g., attention, GEMM, inter-GPU communication) for NVIDIA's new architectures.

  • Implement and fine-tune kernels to achieve optimal performance on NVIDIA GPUs.

  • Conduct in-depth performance analysis of GPU kernels, including attention and other critical operations.

  • Identify bottlenecks, optimize resource utilization, and improve throughput and power efficiency.

  • Create and maintain workloads and micro-benchmark suites to evaluate kernel performance across various hardware and software configurations.

  • Generate performance projections, comparisons, and detailed analysis reports for internal and external stakeholders.

  • Collaborate with architecture, software, and product teams to guide the development of next-generation deep learning hardware and software.

What we need to see:

  • Pursuing a BS, MS, or PhD in a relevant discipline (CS, EE, CE).

  • Strong software skills in C/C++, Python, MPI, OpenMP, etc.

  • Solid computer science background in software and hardware architecture.

  • Experience with deep learning (DL) workloads and operator performance is a plus.

  • Familiarity with GPU computing and parallel programming models is a plus.

  • Excellent oral and written communication skills.

  • Good organizational, time management and task prioritization skills.

NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.

#deeplearning

About NVIDIA

NVIDIA (Public)

A computing platform company operating at the intersection of graphics, HPC, and AI.

Employees: 10,001+

Headquarters: Santa Clara

Valuation: $4.57T

Reviews

4.1 (10 reviews)

Work-Life Balance: 3.5

Compensation: 4.2

Culture: 4.3

Career: 4.5

Management: 4.0

Recommend to a Friend: 75%

Pros

Great culture and supportive environment

Smart colleagues and excellent people

Cutting-edge technology and learning opportunities

Cons

Team-dependent experience and outcomes

Work-life balance issues with long hours

Politics and influence over competence

Salary Ranges

47 data points

Junior/L3 · Analyst (7 reports)

Total: $170,275 / year

Base: $130,981

Stock: -

Bonus: -

$155,480

$234,166

Interview Experience

7 interviews

Difficulty: 3.1 / 5

Experience: Positive 0% · Neutral 86% · Negative 14%

Interview Process

1. Application Review

2. Recruiter Screen

3. Online Assessment

4. Technical Interview

5. System Design Interview

6. Team Review

Common Questions

Coding/Algorithm

System Design

Technical Knowledge

Behavioral/STAR