Jobs
Required Skills
Python
PyTorch
Rust
C++
Leadership
System Design
ABOUT THE ROLE:
We are looking for an Inference Engineering Manager to lead our AI Inference team. This is a unique opportunity to build and scale the infrastructure that powers Perplexity's products and APIs, serving millions of users with state-of-the-art AI capabilities.
You will own the technical direction and execution of our inference systems while building and leading a world-class team of inference engineers. Our current stack includes Python, PyTorch, Rust, C++, and Kubernetes. You will help architect and scale the large-scale deployment of machine learning models behind Perplexity's Comet, Sonar, Search, and Deep Research products.
WHY PERPLEXITY?
- Build SOTA systems that are the fastest in the industry with cutting-edge technology
- High-impact work on a smaller team with significant ownership and autonomy
- Opportunity to build 0-to-1 infrastructure from scratch rather than maintaining legacy systems
- Work on the full spectrum: reducing cost, scaling traffic, and pushing the boundaries of inference
- Direct influence on technical roadmap and team culture at a rapidly growing company
RESPONSIBILITIES:
- Lead and grow a high-performing team of AI inference engineers
- Develop APIs for AI inference used by both internal and external customers
- Architect and scale our inference infrastructure for reliability and efficiency
- Benchmark and eliminate bottlenecks throughout our inference stack
- Drive large sparse/MoE model inference at rack scale, including sharding strategies for massive models
- Push the frontier by building inference systems that support sparse attention, disaggregated prefill/decode serving, and related techniques
- Improve the reliability and observability of our systems and lead incident response
- Own technical decisions around batching, throughput, latency, and GPU utilization
- Partner with ML research teams on model optimization and deployment
- Recruit, mentor, and develop engineering talent
- Establish team processes, engineering standards, and operational excellence
QUALIFICATIONS:
- 5+ years of engineering experience with 2+ years in a technical leadership or management role
- Deep experience with ML systems and inference frameworks (PyTorch, TensorFlow, ONNX, TensorRT, vLLM)
- Strong understanding of LLM architecture: Multi-Head Attention, Multi/Grouped-Query Attention, and common layers
- Experience with inference optimizations: batching, quantization, kernel fusion, FlashAttention
- Familiarity with GPU characteristics, roofline models, and performance analysis
- Experience deploying reliable, distributed, real-time systems at scale
- Track record of building and leading high-performing engineering teams
- Experience with parallelism strategies: tensor parallelism, pipeline parallelism, expert parallelism
- Strong technical communication and cross-functional collaboration skills
NICE TO HAVE:
- Experience with CUDA, Triton, or custom kernel development
- Background in training infrastructure and RL workloads
- Experience with Kubernetes and container orchestration at scale
- Published work or contributions to inference optimization research
About Perplexity AI

Perplexity AI (Series B)
Perplexity AI, Inc., or simply Perplexity, is an American privately held software company offering a web search engine that processes user queries and synthesizes responses.
Employees: 51-200
Headquarters: San Francisco
Valuation: $1B
Reviews
3.8 (10 reviews)
Work-life balance: 3.2
Compensation: 2.5
Culture: 4.0
Career: 2.5
Management: 2.8
65% would recommend to a friend
Pros
Supportive team and management
Good work-life balance and flexibility
Cutting-edge technology and interesting projects
Cons
Low compensation compared to industry standards
Poor management and lack of leadership direction
Fast-paced and overwhelming workload
Salary Information
26 data points
Junior/L3 · LLM Teacher (1 report)
Total compensation: $101,920
Base salary: $78,400
Stock: -
Bonus: -
Interview Experience
1 interview
Difficulty: 4.0 / 5
Duration: 14-28 weeks
Experience: Positive 0%, Neutral 0%, Negative 100%
Interview Process
1. Application Review
2. HR Screen
3. Take-home Marketing Challenge
4. Hiring Manager Interview
5. Panel Interview
6. Offer
Frequently Asked Questions
Digital Marketing Strategy
Campaign Performance Analysis
Behavioral/STAR
Technical Marketing Knowledge
Case Study
News & Buzz
- Perplexity launches Personal Computer that brings AI agents Directly on your Mac (The Times of India, 3d ago)
- "Perplexity" Unveils a Broader Vision for the Role of Artificial Intelligence in Personal Computing (وكالة صدى نيوز, 3d ago)
- Perplexity AI Cheat Sheet: How an ‘Answer Engine’ Is Challenging Gemini, ChatGPT (eWeek, 4d ago)
- Perplexity priced me out of its OpenClaw clone (PCWorld, 4d ago)