Who we are
At Twelve Labs, we are pioneering the development of cutting-edge multimodal foundation models that have the ability to comprehend videos just like humans do. Our models have redefined the standards in video-language modeling, empowering us with more intuitive and far-reaching capabilities, and fundamentally transforming the way we interact with and analyze various forms of media.
With $110+ million in Seed and Series A funding, our company is backed by top-tier venture capital firms such as NVIDIA’s NVentures, NEA, Radical Ventures, and Index Ventures, and by prominent AI visionaries and founders such as Fei-Fei Li, Silvio Savarese, Alexandr Wang, and more. Headquartered in San Francisco, with an influential APAC presence in Seoul, our global footprint underscores our commitment to driving worldwide innovation.
Our partnership with NVIDIA and AWS gives us access to the most advanced chips, including B300s, enabling us to push the boundaries of what's possible in video AI.
We are a global company that values the uniqueness of each person’s journey. It is the differences in our cultural, educational, and life experiences that allow us to constantly challenge the status quo. We are looking for individuals who are motivated by our mission and eager to make an impact as we push the bounds of technology to transform the world. Join us as we revolutionize video understanding and multimodal AI.
About the Team
The Pegasus team sits at the core of Twelve Labs’ video understanding capabilities and is responsible for driving Pegasus, our Video Analysis product. Our focus is on developing multimodal video analysis systems that follow instructions reliably and produce highly complex, hierarchically structured outputs. We focus on shipping products with real-world value rather than doing research in isolation, and we work in a goal-oriented, cross-functional team that encompasses both ML researchers and engineers.
Our work covers a broad range of challenges: large-scale distributed training of multimodal LLMs, spanning pre-training through RL; accurate temporal segmentation and structured metadata extraction for real-world use cases; extending temporal context length to multiple hours; and data curation processes that enable well-aligned evaluation and performance improvements through training data enhancements.
Our team has access to the most advanced chips in the world, including NVIDIA B300s, which lets us push the boundaries of video analysis systems and move from research to production as quickly as possible.
This role may be leveled as ML Research Scientist, Senior ML Research Scientist, or Staff ML Research Scientist depending on a candidate’s research depth, technical scope, and track record of impact.
In this role, you will
- Define and drive research problems that advance Pegasus’s video analysis capabilities, from hypothesis formulation through experimentation and iteration.
- Design and run rigorous experiments across model architecture, training strategy, data curation, and evaluation to improve the quality of our multimodal systems.
- Build evaluation methods and data curation processes that translate real-world use cases into reliable research signals and measurable model improvements.
- Work closely with ML Engineers to turn research outcomes into robust systems with real product impact.
- Communicate research findings clearly and use them to inform technical direction across the team.
Even if you don't check every box, we encourage you to apply.
If you're a zero-to-one achiever, a ferocious learner, and a kind team player who motivates others, you'll find a home at Twelve Labs.
You may be a good fit if you have
- Strong research experience in one or more relevant areas such as multimodal or unimodal LLMs, large-scale distributed training, data-centric model development, computer vision, or vision-language modeling.
- A track record of independently driving research from ideation to execution, demonstrated through projects, technical contributions, or research outputs.
- Strong proficiency in Python and PyTorch.
- Strong experimental judgment, including the ability to design evaluations, run rigorous ablations, and draw clear conclusions from empirical results.
- The ability to communicate effectively and collaborate closely with both researchers and engineers.
Preferred qualifications
- Experience working on multimodal systems involving video, vision, language, or structured output generation.
- Experience improving model quality through data curation, evaluation design, or training data enhancements.
- Experience with large-scale distributed training in high-performance GPU environments.
- Experience translating research advances into production ML systems.
- MS, PhD, or equivalent practical experience in Machine Learning, Computer Science, or a related technical field.
Others
- Work Location: Seoul Itaewon office + Pangyo satellite office
- Additional Info: Enrollment or transfer as Technical Research Personnel (전문연구요원) is available.
Hiring Process
Application Review → Recruiter Interview (remote, 30 min) → Loop Interview: Hiring Manager Interview & Live Coding Test (in person, about 90 min) → Loop Interview: System Design & Final Round Interview (in person, about 120 min) → Reference Check → Offer
Benefits and Perks
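- A global team that grows alongside global B2B customers
- Hybrid work that combines autonomy and collaboration
- A MacBook and KRW 700,000 worth of home-office equipment for every employee, with equipment replaced on a three-year cycle
- A corporate card with a monthly limit of KRW 600,000, usable freely for meals, transportation, and similar expenses
- In-office snack bar (snacks, coffee, and fresh food)
- Two-week winter break at year-end
- Annual health checkup support
- English education program support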