Benefits & Perks
• Flexible Work Arrangements
• Remote Work
• Flexible Hours
Required Skills
Python
LangChain
LlamaIndex
Hugging Face
MLflow
Kubernetes
Docker
Machine Learning
LLM
MLOps
About The Job:
We are seeking a visionary and hands-on Senior AI Technical Lead to spearhead our Generative AI initiatives. While many can build a prototype, you are the expert who can take it to production. This role focuses on the end-to-end lifecycle of GenAI: from high-performance inference hosting and automated MLOps pipelines to rigorous model benchmarking and safety guardrails.
You will lead a high-performing team to design systems that are not only intelligent but are scalable, cost-optimized, and ethically governed.
What Will You Do:
MLOps & High-Performance Inference:
- Inference Server Management: Architect and optimize model serving using high-throughput engines like vLLM, NVIDIA Triton Inference Server, or TGI (Text Generation Inference).
- Scalable Hosting: Deploy and manage LLMs on Kubernetes (K8s), implementing auto-scaling based on concurrency and token throughput.
- MLOps Pipelines: Build robust CI/CD/CT (Continuous Testing) pipelines for model deployment, versioning, and rollback strategies.
- Resource Optimization: Implement model optimization techniques such as quantization (AWQ, GPTQ), LoRA/QLoRA adapters, and caching strategies to minimize latency and GPU costs.
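The caching strategies mentioned above can be as simple as an exact-match LRU cache placed in front of the model server, so repeated prompts skip a full GPU inference pass. A minimal sketch in Python (the `PromptCache` name and injected `generate` callable are illustrative, not any specific serving engine's API):

```python
from collections import OrderedDict

class PromptCache:
    """Exact-match LRU cache for model responses.

    Serving a cached response for a repeated prompt skips a full
    inference pass, cutting both latency and GPU cost.
    """

    def __init__(self, max_entries: int = 1024):
        self.max_entries = max_entries
        self._store: OrderedDict[str, str] = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_generate(self, prompt: str, generate) -> str:
        if prompt in self._store:
            self.hits += 1
            self._store.move_to_end(prompt)  # mark as most recently used
            return self._store[prompt]
        self.misses += 1
        response = generate(prompt)  # the expensive model call
        self._store[prompt] = response
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used
        return response
```

In production the exact-match key would typically be replaced with a normalized or semantic key, but the eviction and hit-accounting logic stays the same.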
Evaluation, Benchmarking & Guardrails:
- Model Benchmarking: Establish a systematic framework for benchmarking LLMs/SLMs against industry standards (e.g., MMLU, HumanEval) and custom business-specific datasets.
- Automated Evaluation: Lead the implementation of LLM-as-a-judge workflows and evaluation frameworks like RAGAS, DeepEval, or LangSmith to measure relevance, faithfulness, and noise robustness.
- AI Guardrails: Design and deploy real-time safety layers using NeMo Guardrails, Guardrails AI, or Llama Guard to prevent hallucinations, PII leakage, and toxic outputs.
- A/B Testing: Design experimentation frameworks to compare model versions, prompt iterations, and RAG architectures in live environments.
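An LLM-as-a-judge workflow like the one described above reduces to scoring each record with a judge model and aggregating pass rates. A minimal, framework-agnostic sketch (`evaluate_batch` and the injected `judge` callable are hypothetical names standing in for a RAGAS- or DeepEval-style harness; in production the judge would wrap a strong LLM prompted as an evaluator):

```python
from statistics import mean
from typing import Callable

def evaluate_batch(
    records: list[dict],
    judge: Callable[[str, str, str], float],
    threshold: float = 0.7,
) -> dict:
    """Score (question, answer, context) records and aggregate results.

    judge: returns a 0.0-1.0 score, e.g. a faithfulness rating from
    an evaluator LLM. Injecting it keeps the harness model-agnostic.
    """
    scores = [judge(r["question"], r["answer"], r["context"]) for r in records]
    return {
        "mean_score": mean(scores),
        "pass_rate": sum(s >= threshold for s in scores) / len(scores),
        "failures": [r for r, s in zip(records, scores) if s < threshold],
    }
```

The `failures` list is what feeds back into prompt iteration or regression gates in a CI/CD/CT pipeline.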
Architecture & Patterns:
- Production Patterns: Design multi-layered microservices that integrate with WhatsApp, Instagram, and web platforms via robust API gateways.
- Observability: Implement deep monitoring for Token-per-Second (TPS), Time-To-First-Token (TTFT), and cost-per-request using Prometheus, Grafana, or specialized AI observability tools.
- Data Ingestion for RAG: Build automated, secure pipelines for data chunking, embedding generation, and vector database synchronization (Pinecone, Weaviate, or Milvus).
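The chunking step of a RAG ingestion pipeline is often just a fixed-size window with overlap, so sentences that straddle a chunk boundary remain retrievable from at least one chunk. A minimal sketch (the `chunk_text` helper is illustrative; production pipelines usually chunk on token or sentence boundaries rather than characters):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap.

    Overlapping windows keep content that straddles a boundary
    visible to the embedding model in at least one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```

Each resulting chunk would then be embedded and upserted into the vector store (Pinecone, Weaviate, or Milvus) keyed by document and chunk index.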
Technical Leadership:
- Team Mentorship: Lead a team of 10–12 engineers, fostering a culture of Production-First AI development.
- Strategic Roadmap: Drive the technical vision for internal AI tooling, including prompt libraries and model registries.
- Stakeholder Collaboration: Translate complex performance metrics (like P99 latency) into business impact for product managers and executives.
What You Will Bring
- Experience: 8–10 years in AI/ML development, with 3 years focused specifically on LLM productionization and MLOps.
- Inference Expertise: Proven track record of serving large models in production environments (local or cloud-hosted).
Deep Tech Stack:
- Languages: Expert-level Python.
- Frameworks: LangChain, LlamaIndex, Hugging Face (Transformers/PEFT/Accelerate).
- MLOps Tools: MLflow, Weights & Biases, Kubeflow, or BentoML.
- Safety/Eval: Experience with NeMo Guardrails, RAGAS, or custom evaluation harnesses.
- Infrastructure: Mastery of Docker, Kubernetes (GPU orchestration), and Azure AI Studio / AWS SageMaker.
- Academic Background: Bachelor’s or Master’s degree in Computer Science, AI/ML, or a related technical field.
About Red Hat
Red Hat is the world’s leading provider of enterprise open source software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies. Spread across 40 countries, our associates work flexibly across work environments, from in-office, to office-flex, to fully remote, depending on the requirements of their role. Red Hatters are encouraged to bring their best ideas, no matter their title or tenure. We're a leader in open source because of our open and inclusive environment. We hire creative, passionate people ready to contribute their ideas, help solve complex problems, and make an impact.
Inclusion at Red Hat
Red Hat’s culture is built on the open source principles of transparency, collaboration, and inclusion, where the best ideas can come from anywhere and anyone. When this is realized, it empowers people from different backgrounds, perspectives, and experiences to come together to share ideas, challenge the status quo, and drive innovation. Our aspiration is that everyone experiences this culture with equal opportunity and access, and that all voices are not only heard but also celebrated. We hope you will join our celebration, and we welcome and encourage applicants from all the beautiful dimensions that compose our global village.
Equal Opportunity Policy (EEO)
Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.
Red Hat does not seek or accept unsolicited resumes or CVs from recruitment agencies. We are not responsible for, and will not pay, any fees, commissions, or any other payment related to unsolicited resumes or CVs except as required in a written contract between Red Hat and the recruitment agency or party requesting payment of a fee.
Red Hat supports individuals with disabilities and provides reasonable accommodations to job applicants. If you need assistance completing our online job application, email application-assistance@redhat.com. General inquiries, such as those regarding the status of a job application, will not receive a reply.