Engineering Manager - Inference

Perplexity AI

San Francisco · On-site · Full-time · 1mo ago

Required skills

Python

PyTorch

Rust

C++

Leadership

System Design

ABOUT THE ROLE:

We are looking for an Inference Engineering Manager to lead our AI Inference team. This is a unique opportunity to build and scale the infrastructure that powers Perplexity's products and APIs, serving millions of users with state-of-the-art AI capabilities.

You will own the technical direction and execution of our inference systems while building and leading a world-class team of inference engineers. Our current stack includes Python, PyTorch, Rust, C++, and Kubernetes. You will help architect and scale the deployment of machine learning models behind Perplexity's Comet, Sonar, Search, and Deep Research products.

WHY PERPLEXITY?

  • Build SOTA systems that are the fastest in the industry with cutting-edge technology

  • High-impact work on a smaller team with significant ownership and autonomy

  • Opportunity to build 0-to-1 infrastructure from scratch rather than maintaining legacy systems

  • Work on the full spectrum: reducing cost, scaling traffic, and pushing the boundaries of inference

  • Direct influence on technical roadmap and team culture at a rapidly growing company

RESPONSIBILITIES:

  • Lead and grow a high-performing team of AI inference engineers

  • Develop APIs for AI inference used by both internal and external customers

  • Architect and scale our inference infrastructure for reliability and efficiency

  • Benchmark and eliminate bottlenecks throughout our inference stack

  • Drive large sparse/MoE model inference at rack scale, including sharding strategies for massive models

  • Push the frontier by building inference systems that support sparse attention, disaggregated prefill/decode serving, and similar techniques

  • Improve the reliability and observability of our systems and lead incident response

  • Own technical decisions around batching, throughput, latency, and GPU utilization

  • Partner with ML research teams on model optimization and deployment

  • Recruit, mentor, and develop engineering talent

  • Establish team processes, engineering standards, and operational excellence

QUALIFICATIONS:

  • 5+ years of engineering experience with 2+ years in a technical leadership or management role

  • Deep experience with ML systems and inference frameworks (PyTorch, TensorFlow, ONNX, TensorRT, vLLM)

  • Strong understanding of LLM architecture: Multi-Head Attention, Multi/Grouped-Query Attention, and common layers

  • Experience with inference optimizations: batching, quantization, kernel fusion, FlashAttention

  • Familiarity with GPU characteristics, roofline models, and performance analysis

  • Experience deploying reliable, distributed, real-time systems at scale

  • Track record of building and leading high-performing engineering teams

  • Experience with parallelism strategies: tensor parallelism, pipeline parallelism, expert parallelism

  • Strong technical communication and cross-functional collaboration skills

NICE TO HAVE:

  • Experience with CUDA, Triton, or custom kernel development

  • Background in training infrastructure and RL workloads

  • Experience with Kubernetes and container orchestration at scale

  • Published work or contributions to inference optimization research

About Perplexity AI

Perplexity AI, Inc., or simply Perplexity, is an American privately held software company offering a web search engine that processes user queries and synthesizes responses.

Employees: 51-200

Headquarters: San Francisco

Valuation: $1B

Reviews

Overall rating: 3.8 (10 reviews)

Work-life balance: 3.2

Compensation: 2.5

Culture: 4.0

Career: 2.5

Management: 2.8

Would recommend to a friend: 65%

Pros

Supportive team and management

Good work-life balance and flexibility

Cutting-edge technology and interesting projects

Cons

Low compensation compared to industry standards

Poor management and lack of leadership direction

Fast-paced and overwhelming workload

Salary information

26 data points

Junior/L3 · LLM Teacher (1 report)

Total compensation: $101,920

Base salary: $78,400

Stock: -

Bonus: -

Interview experience

1 interview

Difficulty: 4.0 / 5

Duration: 14-28 weeks

Experience: positive 0%, neutral 0%, negative 100%

Interview process

1. Application Review

2. HR Screen

3. Take-home Marketing Challenge

4. Hiring Manager Interview

5. Panel Interview

6. Offer

Frequently asked questions

Digital Marketing Strategy

Campaign Performance Analysis

Behavioral/STAR

Technical Marketing Knowledge

Case Study