Senior Software Engineer

Microsoft

China, Beijing, Beijing; China, Jiangsu, Suzhou · On-site · Full-time · 6d ago

Overview

The R&D of Search Ads aims to build an online advertising ecosystem of users, advertisers, and the search engine.

The Bing Search Ads Understanding team is chartered to deliver world-class algorithms using web-scale data. Our mission is to drive user satisfaction, advertiser ROI, and Bing revenue. A core challenge is to match advertisers' "Ad display" and users' "query" by building an intelligent system that truly understands users' needs. This is a very hard problem that demands the most advanced AI models and sophisticated engineering systems. Join us to work on projects highly strategic to Bing search in a fun and fast-paced environment!

We are hiring a Senior Software Engineer (GPU Inference Optimization) to work on GPU inference optimization of language models, supporting GPU serving of models for Ads tasks such as query rewrite, Ad relevance, and Ad creative generation. As a member of this team, you will have the opportunity to work on the fundamental abstractions, programming models, runtimes, libraries, and APIs that enable large-scale inferencing and online serving of models on novel AI hardware.

This is a technical role focused on GPU inference optimization of language models: it requires hands-on software development skills. We’re looking for someone who has a demonstrated history of solving hard technical problems and is motivated to tackle the hardest problems in building a full end-to-end AI stack. An entrepreneurial approach and ability to take initiative and move fast are essential.

In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities:

  • Design, develop, and maintain high-performance software in C/C++ and Python, including GPU programming with CUDA, ROCm, or Triton.
  • Optimize model inference and training pipelines for speed, throughput, memory efficiency, and cost across GPU platforms.
  • Collaborate with platform teams to integrate and tune solutions on emerging accelerator stacks and rapidly evolving toolchains.
  • Profile workloads end-to-end, identify bottlenecks, and implement kernel-level and system-level performance improvements.
  • Partner with internal and external stakeholders to translate requirements into scalable performance features and optimizations for state-of-the-art models.
  • Validate performance, stability, and correctness through benchmarking, automated testing, and production readiness reviews.

Qualifications:

Required Qualifications:

  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, Python, CUDA, or ROCm, OR equivalent experience.
  • 3+ years' practical experience working on applications that use GPUs, and experience optimizing their performance.
  • Practical experience writing new GPU kernels, going beyond experience running GPU workloads with existing library kernels.
  • Cross-team collaboration skills and the desire to collaborate in a team of researchers and developers.

Preferred Qualifications:

  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C/C++, CUDA, or ROCm, OR Master's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C/C++, CUDA, or ROCm, OR equivalent experience.
  • Experience in low-level performance analysis and optimization, including proficiency using GPU profiling tools such as NVIDIA Visual Profiler and NVIDIA Nsight Compute.
  • Technical background and solid foundation in software engineering principles and architecture design.
  • Familiarity with inference optimization and experience developing popular inference frameworks such as TensorRT-LLM, SGLang, or vLLM.
  • Exposure to Deep Neural Network inference and experience in one or more deep learning frameworks such as PyTorch, TensorFlow, or ONNX Runtime.

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.

About Microsoft

Microsoft

A software corporation that develops, manufactures, licenses, supports, and sells a range of software products and services.

10,001+

Employees

Redmond

Headquarters

$3000B

Valuation

Reviews

3.8

5 reviews

Work Life Balance

4.1

Compensation

4.3

Culture

3.4

Career

3.2

Management

3.0

65%

Recommend to a Friend

Pros

Excellent compensation and benefits package

Four-day workweek with improved work-life balance

Supportive managers and teams

Cons

High-pressure environment causing anxiety

Unprofessional interview processes

Limited creative work opportunities

Salary Ranges

5,571 data points

Junior/L3

Mid/L4

Junior/L3 · Advertising Client Success

2 reports

$163,358

total / year

Base

$141,875

Stock

-

Bonus

-

$163,358

$163,358

Interview Experience

7 interviews

Difficulty

3.7

/ 5

Duration

14-28 weeks

Offer Rate

14%

Experience

Positive 14%

Neutral 29%

Negative 57%

Interview Process

1

Application Review

2

Recruiter Screen

3

Technical Phone Screen

4

Technical Interview

5

Onsite/Virtual Interviews

6

Final Round

7

Offer

Common Questions

Coding/Algorithm

System Design

Behavioral/STAR

Technical Knowledge

Past Experience