GPU Kernel Development Engineer

AMD

Shanghai · On-site · Full-time · 1mo ago

Benefits & Perks

Competitive salary and equity package

Team events and activities

Professional development budget

Parental leave

Required Skills

JavaScript

Python

React

WHAT YOU DO AT AMD CHANGES EVERYTHING:

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE:

We are seeking a talented Machine Learning Kernel Developer to design, develop, and optimize low-level machine learning kernels for AMD GPUs using the ROCm software stack. In this role, you will work on high-impact projects to accelerate AI frameworks and libraries, with a focus on emerging technologies like Large Language Models (LLMs) and other generative AI workloads.

THE PERSON:

The ideal candidate will have hands-on experience with GPU programming (ROCm or CUDA) and a passion for pushing the boundaries of AI performance.

KEY RESPONSIBILITIES:

  • Design and implement highly optimized ML kernels (e.g., matrix operations, attention mechanisms) for AMD GPUs using ROCm.
  • Profile, debug, and tune kernel performance to maximize hardware utilization for AI workloads.
  • Collaborate with ML researchers and framework developers to integrate kernels into AI frameworks (e.g., PyTorch, TensorFlow) and inference engines (e.g., vLLM).
  • Contribute to the ROCm software stack by identifying and resolving bottlenecks in libraries like MIOpen, HIP, or Composable Kernel.
  • Stay updated on the latest AI/ML trends (LLMs, quantization, distributed inference) and apply them to kernel development.
  • Document and communicate technical designs, benchmarks, and best practices.
  • Troubleshoot and resolve issues related to GPU compatibility, performance, and scalability.

REQUIRED EXPERIENCE:

  • 2+ years of experience in GPU kernel development for machine learning (ROCm or CUDA).
  • Proficiency in C/C++ and Python, with experience in performance-critical programming.
  • Strong understanding of ML frameworks (PyTorch, TensorFlow) and GPU-accelerated libraries.
  • Basic knowledge of modern AI technologies (LLMs, transformers, inference optimization).
  • Familiarity with parallel computing, memory optimization, and hardware architectures.
  • Problem-solving skills and ability to work in a fast-paced environment.

PREFERRED EXPERIENCE:

  • Direct experience with AMD ROCm development (HIP, MIOpen, Composable Kernel).
  • Knowledge of LLM-specific optimizations (e.g., FlashAttention, PagedAttention in vLLM).
  • Experience with distributed training/inference or model compression techniques.
  • Contributions to open-source ML projects or GPU compute libraries.

ACADEMIC CREDENTIALS:

  • Bachelor’s/Master’s in Computer Science, Electrical Engineering, or related field.

Benefits offered are described in AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law.   We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position.  AMD’s “Responsible AI Policy” is available here.

This posting is for an existing vacancy.

About AMD

AMD (Public)

A semiconductor company that designs and develops graphics units, processors, and media solutions

Employees: 10,001+
Headquarters: Santa Clara

Reviews

Overall: 3.5 (25 reviews)
Work Life Balance: 3.2
Compensation: 4.1
Culture: 3.6
Career: 3.4
Management: 3.1
Recommend to a Friend: 65%

Pros

Good compensation and benefits

Positive work environment

Great management and coworkers

Cons

Poor work life balance

Micromanagement and excessive tracking

Too much pressure and workload

Salary Ranges

6 data points

L2 · Data Analyst L2 (0 reports)

Total: $76,430 / year
Base: $30,572
Stock: $38,215
Bonus: $7,643
Range: $53,501 to $99,359

Interview Experience

5 interviews

Difficulty: 3.6 / 5
Duration: 14-28 weeks
Offer Rate: 60%
Experience: Positive 20%, Neutral 20%, Negative 60%

Interview Process

1. Application Review
2. Recruiter Screen
3. Technical Phone Screen
4. Technical Interview
5. Hiring Manager Interview
6. Offer

Common Questions

Coding/Algorithm

Technical Knowledge

Behavioral/STAR

Past Experience

System Design