Senior AI Operations (AI Ops) Engineer

Navan

Tel-Aviv, Israel · On-site · Full-time · 1w ago

Required Skills

  • Python, Terraform, AWS SageMaker

At Navan, we aren't building a single, generic chatbot. We are building a Composable AI Microservice Architecture, a swarm of hundreds of hyper-specialized AI services, each meticulously "programmed" to solve small, focused tasks with high precision. This fleet powers Ava, our AI support engine, and a suite of cutting-edge generative tools for travel and expense management.

As a Senior AI Operations (AI Ops) Engineer, you are the architect of the platform that makes this scale possible. You will move beyond traditional MLOps to manage a "factory" of Language Models. Your challenge is one of orchestration and standardization, ensuring that every service in the swarm meets a rigorous bar for quality, reliability, and cost-efficiency.

What You'll Do

  • Orchestrate the AI Fleet: Build and own the runtime environment for 100+ specialized AI services. Manage model routing, context versioning, and standardized memory/history stores.

  • High-Density Inference Optimization: Design and implement SageMaker Multi-Model Endpoints (MME) and Inference Components to serve multiple tuned SLMs per GPU, maximizing hardware utilization while minimizing latency.

  • Deterministic Service Excellence: Treat reliability as a layered engineering problem. Build deterministic "shells" around probabilistic LM outputs, prioritizing data-layer validation and strict serialization.

  • Automated Evaluation & Observability: Implement "LLM-as-a-judge" patterns and automated benchmarking to detect semantic drift and hallucinations across the fleet before they impact the user.

  • Standardize the Workflow: Obsess over building reusable patterns and Terraform-based infrastructure that eliminate "snowflake" configurations, allowing us to deploy new specialized AI tasks in minutes.

  • Agency Strategy: Partner with AI Researchers to find the "Goldilocks zone" for agentic autonomy—balancing the flexibility of LLM tool-use with the precision required for production stability.
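As a rough illustration of the deterministic "shell" idea in the list above, the following sketch wraps a probabilistic model reply in strict parsing and validation, with bounded retries and a fixed fallback. Every name here (`TicketLabel`, `ALLOWED_CATEGORIES`, `parse_label`, `classify`, `FALLBACK`) is hypothetical and not taken from Navan's stack; `call_model` stands in for whatever client actually invokes the language model.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical label set for one small, focused service.
ALLOWED_CATEGORIES = {"flight", "hotel", "expense"}


@dataclass(frozen=True)
class TicketLabel:
    category: str
    confidence: float


# Deterministic answer returned when the model never produces valid output.
FALLBACK = TicketLabel(category="expense", confidence=0.0)


class ValidationError(ValueError):
    """Raised when a model reply does not match the expected schema."""


def parse_label(raw: str) -> TicketLabel:
    """Strictly deserialize a model reply; reject anything off-schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValidationError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or set(data) != {"category", "confidence"}:
        raise ValidationError("unexpected shape or keys")
    category, confidence = data["category"], data["confidence"]
    if category not in ALLOWED_CATEGORIES:
        raise ValidationError(f"unknown category: {category!r}")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValidationError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValidationError("confidence out of range")
    return TicketLabel(category=category, confidence=float(confidence))


def classify(call_model: Callable[[str], str], text: str,
             max_attempts: int = 3) -> TicketLabel:
    """Retry on invalid replies, then fall back to a fixed, known answer."""
    for _ in range(max_attempts):
        try:
            return parse_label(call_model(text))
        except ValidationError:
            continue
    return FALLBACK
```

The point of the design is that callers only ever see a `TicketLabel`, so failures surface at the serialization layer instead of leaking malformed text downstream.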

What We're Looking For

  • Experience: 5+ years in SRE, Platform Engineering, or MLOps, with at least 2 years focused on deploying LLMs/SLMs in production environments.

  • SageMaker Mastery: Deep hands-on expertise with AWS SageMaker, specifically configuring Multi-Model Endpoints (MME), Inference Components, and GPU-backed instances (G5/P4).

  • SLM Expertise: Proven experience with Small Language Models (e.g., Mistral, Llama 3, Phi) and parameter-efficient fine-tuning (PEFT) deployment strategies like LoRA/QLoRA.

  • Technical Stack:

      • Languages: Strong proficiency in Python and Terraform.

      • Orchestration: Experience with Docker, Kubernetes (EKS), or AWS ECS/Fargate.

      • Data: Familiarity with Snowflake and vector databases.

  • The "AI Ops" Mindset: You understand that AI at scale is a statistical challenge. You are comfortable debugging issues at the data/serialization layer rather than defaulting to prompt tweaks.

  • CI/CD & Automation: Experience building robust pipelines (Jenkins, GitHub Actions) for non-deterministic software, including automated "eval" stages.

  • Education: BS or MS in Computer Science, Engineering, Mathematics, or a related technical field.
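The automated "eval" stage mentioned in the list above could take many forms. One minimal sketch, assuming a judge model has already scored each benchmark reply between 0 and 1, is a gate that fails the pipeline when the mean score regresses against a stored baseline. The function name, thresholds, and score scale are all illustrative assumptions, not a description of Navan's pipeline.

```python
from statistics import mean
from typing import Sequence


def passes_eval_gate(judge_scores: Sequence[float], baseline_mean: float,
                     max_drop: float = 0.02, min_samples: int = 30) -> bool:
    """Return True only if the candidate's mean judge score has not regressed.

    judge_scores:  per-example scores in [0, 1] from a hypothetical judge model.
    baseline_mean: mean score of the currently deployed version.
    max_drop:      largest tolerated decrease before the gate fails.
    min_samples:   refuse to pass on too little evidence.
    """
    if len(judge_scores) < min_samples:
        return False
    return mean(judge_scores) >= baseline_mean - max_drop
```

A CI job (in Jenkins or GitHub Actions, for example) might call this after the benchmark run and exit non-zero when it returns `False`, which blocks the deploy step.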

Must have

  • Python, Terraform, SageMaker

About Navan

Navan is a corporate travel and expense management platform that combines travel booking, expense reporting, and payment solutions for businesses.

  • Stage: Series F+

  • Employees: 1,001-5,000

  • Headquarters: Palo Alto

  • Valuation: $9.2B

Reviews

3.8 (15 reviews)

  • Work Life Balance: 2.0

  • Compensation: 3.5

  • Culture: 1.5

  • Career: 2.0

  • Management: 1.0

  • Recommend to a Friend: 15%

Pros

  • High compensation potential (600K TC mentioned)

  • Strong revenue growth (32% YoY to $613M)

  • Good net dollar retention (+110%)

Cons

  • Toxic work environment and culture

  • Terrible management at all levels

  • Engineering organization described as 'royal mess'

Salary Ranges

26 data points

Levels: Junior/L3, Mid/L4

Junior/L3 · Data Analyst (0 reports)

  • Total: $169,150 / year (range: $143,778 to $194,522)

  • Base: -

  • Stock: -

  • Bonus: -

Interview Experience

3 interviews

  • Difficulty: 3.0 / 5

  • Duration: 14-28 weeks

Interview Process

  1. Application Review

  2. Phone Screen

  3. Loop Round Interview

  4. Final Interview

  5. Decision

Common Questions

  • Behavioral/STAR

  • Technical Knowledge

  • Past Experience

  • Culture Fit