Research Scientist – VLM Generalist

Stability AI

Remote · Full-time · 1mo ago

Required Skills

Machine Learning

Computer Vision

NLP

Vision-Language Models

Model Fine-tuning

Distributed Training

PyTorch

Location: Remote

About the Role

We’re looking for a Research Scientist with deep expertise in **training and fine-tuning large Vision-Language and Language Models (VLMs / LLMs)** for downstream multimodal tasks. You’ll help push the next frontier of models that reason across vision, language, and 3D, bridging research breakthroughs with scalable engineering.

What You’ll Do

  • Design and fine-tune large-scale VLMs / LLMs — and hybrid architectures — for tasks such as visual reasoning, retrieval, 3D understanding, and embodied interaction.

  • Build robust, efficient training and evaluation pipelines (data curation, distributed training, mixed precision, scalable fine-tuning).

  • Conduct in-depth analysis of model performance: ablations, bias / robustness checks, and generalisation studies.

  • Collaborate across research, engineering, and 3D / graphics teams to bring models from prototype to production.

  • Publish impactful research and help establish best practices for multimodal model adaptation.

What You Bring

  • PhD (or equivalent experience) in Machine Learning, Computer Vision, NLP, Robotics, or Computer Graphics.

  • Proven track record in fine-tuning or training large-scale VLMs / LLMs for real-world downstream tasks.

  • Strong engineering mindset — you can design, debug, and scale training systems end-to-end.

  • Deep understanding of multimodal alignment and representation learning (vision–language fusion, CLIP-style pre-training, retrieval-augmented generation).

  • Familiarity with recent trends, including video-language and long-context VLMs, spatio-temporal grounding, agentic multimodal reasoning, and Mixture-of-Experts (MoE) fine-tuning.

  • Awareness of 3D-aware multimodal models — using NeRFs, Gaussian splatting, or differentiable renderers for grounded reasoning and 3D scene understanding.

  • Hands-on experience with PyTorch / DeepSpeed / Ray and distributed or mixed-precision training.

  • Excellent communication skills and a collaborative mindset.

Bonus / Preferred

  • Experience integrating 3D and graphics pipelines into training workflows (e.g., mesh or point-cloud encoding, differentiable rendering, 3D VLMs).

  • Research or implementation experience with vision-language-action models, world-model-style architectures, or multimodal agents that perceive and act.

  • Familiarity with efficient adaptation methods: LoRA, adapters, QLoRA, parameter-efficient fine-tuning, and distillation for edge deployment.

  • Knowledge of video and 4D generation trends, latent diffusion / rectified flow methods, or multimodal retrieval and reasoning pipelines.

  • Background in GPU optimisation, quantisation, or model compression for real-time inference.

  • Open-source or publication track record in top-tier ML / CV / NLP venues.

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

About Stability AI

Stability AI Ltd is a UK-based artificial intelligence company, best known for its text-to-image model Stable Diffusion.

Funding stage: Series A

Employees: 51-200

Headquarters: London

Valuation: $1B

Reviews

Overall rating: 3.9 (10 reviews)

Work-life balance: 3.2

Compensation: 4.0

Company culture: 4.1

Career: 3.5

Management: 3.7

Would recommend to a friend: 72%

Pros

Flexible working hours

Supportive team and colleagues

Innovative and cutting-edge projects

Areas for Improvement

Heavy and unpredictable workload

Long hours and fast-paced environment

Communication issues

Salary Range

2 data points

Junior/L3 · Recruiter (0 reports)

Total annual compensation: $117,600

Base salary: $117,600

Stock: -

Bonus: -

Range: $99,960 to $135,240

Interview Experience

41 interviews

Difficulty: 4.2 / 5

Duration: 21-35 weeks

Offer rate: 27%

Experience: Positive 70%, Neutral 12%, Negative 18%

Interview Process

1. Recruiter Screen

2. ML Coding

3. ML System Design

4. Research Discussion

5. Team Interviews

Common Interview Questions

ML fundamentals

Design an ML system

Research paper discussion

Statistical concepts