PhD Research Intern, Multi-Modal Foundation Encoder for Perception

Zoox

Foster City, CA · On-site · Internship · Today

About Our Internship Program

Zoox’s internship program offers hands-on experience with cutting-edge technology, mentorship from some of the industry’s brightest minds, and the opportunity to make meaningful contributions to real projects. We seek interns who demonstrate strong academic performance, engagement beyond the classroom, intellectual curiosity, and a genuine interest in Zoox’s mission.

Project Overview

During this internship, you will lead the development of a multi-modal (vision, LiDAR, radar, and language), temporal foundation encoder to support 3D object detection and tracking, 3D segmentation (occupancy), and live maps. This Multi-Modal Foundation Encoder (MMFE) is key to achieving End-to-End Perception at Zoox.

Your research will aim to significantly improve system performance on long-tail events and rare classes by using a large-capacity foundation model to learn rich representations across different sensor modalities. The project also aims to improve perception in adverse environmental conditions such as medium to heavy rain and fog (including reducing false positives on water splashes or dust particles), achieve long-range sensing for highway driving, and build robustness to occlusion.

This is a highly research-driven role with the goal of publication. You will have the opportunity to explore novel directions such as tri-modal foundation models with self-supervised pre-training, radar-language grounding for zero-shot detection, efficient sensor fusion via sparse cross-attention, or integrating 3D Gaussian Splats for dynamic agent geometry and streaming sparse Gaussian occupancy prediction.

About Zoox

Zoox

Acquired

Provide mobility as a service.

Employees: 501-1,000

Headquarters: Foster City

Valuation: $1.3B

Reviews

3.7 (10 reviews)

Work-life balance: 2.5

Compensation: 3.8

Culture: 4.0

Career: 3.2

Management: 2.8

65% recommend to a friend

Pros

Innovative and cutting-edge technology projects

Supportive and collaborative team environment

Good compensation and benefits

Cons

Fast-paced work environment

Poor work-life balance and long hours

High stress levels and workload

Salary Ranges

139 data points

Mid/L4 · Data Scientist (2 reports)

Total: $252,794 per year

Base: $194,457

Stock: -

Bonus: -

Range: $249,600 to $256,188

Interview experience

3 interviews

Difficulty: 3.0 / 5

Duration: 14-28 weeks

Interview process

1. Application Review

2. Recruiter Screen

3. Technical Phone Screen

4. Hiring Manager Interview

5. Final Interview Rounds

Common questions

Coding/Algorithm

Technical Knowledge

Behavioral/STAR

Machine Learning Concepts

Data Analysis