Job Posting
Required Skills
Machine Learning
Computer Vision
NLP
Vision-Language Models
Model Fine-tuning
Distributed Training
PyTorch
Research Scientist – VLM Generalist
Location: Remote
About the Role
We’re looking for a Research Scientist with deep expertise in **training and fine-tuning large Vision-Language and Language Models (VLMs / LLMs)** for downstream multimodal tasks. You’ll help push the next frontier of models that reason across vision, language, and 3D, bridging research breakthroughs with scalable engineering.
What You’ll Do
- Design and fine-tune large-scale VLMs / LLMs, and hybrid architectures, for tasks such as visual reasoning, retrieval, 3D understanding, and embodied interaction.
- Build robust, efficient training and evaluation pipelines (data curation, distributed training, mixed precision, scalable fine-tuning).
- Conduct in-depth analysis of model performance: ablations, bias / robustness checks, and generalisation studies.
- Collaborate across research, engineering, and 3D / graphics teams to bring models from prototype to production.
- Publish impactful research and help establish best practices for multimodal model adaptation.
What You Bring
- PhD (or equivalent experience) in Machine Learning, Computer Vision, NLP, Robotics, or Computer Graphics.
- Proven track record in fine-tuning or training large-scale VLMs / LLMs for real-world downstream tasks.
- Strong engineering mindset: you can design, debug, and scale training systems end-to-end.
- Deep understanding of multimodal alignment and representation learning (vision–language fusion, CLIP-style pre-training, retrieval-augmented generation).
- Familiarity with recent trends, including video-language and long-context VLMs, spatio-temporal grounding, agentic multimodal reasoning, and Mixture-of-Experts (MoE) fine-tuning.
- Awareness of 3D-aware multimodal models that use NeRFs, Gaussian splatting, or differentiable renderers for grounded reasoning and 3D scene understanding.
- Hands-on experience with PyTorch / DeepSpeed / Ray and distributed or mixed-precision training.
- Excellent communication skills and a collaborative mindset.
Bonus / Preferred
- Experience integrating 3D and graphics pipelines into training workflows (e.g., mesh or point-cloud encoding, differentiable rendering, 3D VLMs).
- Research or implementation experience with vision-language-action models, world-model-style architectures, or multimodal agents that perceive and act.
- Familiarity with efficient adaptation methods: LoRA, adapters, QLoRA, parameter-efficient fine-tuning, and distillation for edge deployment.
- Knowledge of video and 4D generation trends, latent diffusion / rectified flow methods, or multimodal retrieval and reasoning pipelines.
- Background in GPU optimisation, quantisation, or model compression for real-time inference.
- Open-source or publication track record in top-tier ML / CV / NLP venues.
Equal Employment Opportunity:
We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.
About Stability AI
Stability AI
Series A
Stability AI Ltd is a UK-based artificial intelligence company, best known for its text-to-image model Stable Diffusion.
Employees: 51-200
Headquarters: London
Valuation: $1B
Reviews
Rating: 3.9 (10 reviews)
Work-life balance: 3.2
Compensation: 4.0
Culture: 4.1
Career: 3.5
Management: 3.7
Would recommend to a friend: 72%
Pros
- Flexible working hours
- Supportive team and colleagues
- Innovative and cutting-edge projects
Cons
- Heavy and unpredictable workload
- Long hours and fast-paced environment
- Communication issues
Salary Information
2 data points
Junior/L3 · Recruiter (0 reports)
Total compensation: $117,600
Base salary: $117,600
Stock: -
Bonus: -
Range: $99,960 to $135,240
Interview Experience
41 interviews
Difficulty: 4.2 / 5
Duration: 21-35 weeks
Offer rate: 27%
Experience: 70% positive, 12% neutral, 18% negative
Interview Process
1. Recruiter Screen
2. ML Coding
3. ML System Design
4. Research Discussion
5. Team Interviews
Frequently Asked Questions
- ML fundamentals
- Design an ML system
- Research paper discussion
- Statistical concepts