
Video understanding AI platform
Research Scientist or Engineer, Video Cognition System
Required skills
AWS
Go
WHO WE ARE:
At Twelve Labs, we are pioneering the development of cutting-edge multimodal foundation models that have the ability to comprehend videos just like humans do. Our models have redefined the standards in video-language modeling, empowering us with more intuitive and far-reaching capabilities, and fundamentally transforming the way we interact with and analyze various forms of media.
With $110+ million in Seed and Series A funding, our company is backed by top-tier venture capital firms such as NVIDIA’s NVentures, NEA, Radical Ventures, and Index Ventures, and by prominent AI visionaries and founders such as Fei-Fei Li, Silvio Savarese, Alexandr Wang, and more. Headquartered in San Francisco, with an influential APAC presence in Seoul, our global footprint underscores our commitment to driving worldwide innovation.
Our partnership with NVIDIA and AWS gives us access to the most advanced chips, including B300s, enabling us to push the boundaries of what's possible in video AI.
We are a global company that values the uniqueness of each person’s journey. It is the differences in our cultural, educational, and life experiences that allow us to constantly challenge the status quo. We are looking for individuals who are motivated by our mission and eager to make an impact as we push the bounds of technology to transform the world. Join us as we revolutionize video understanding and multimodal AI.
ABOUT THE TEAM:
The Video Cognition System team is building the world's first video cognition system that transforms raw video archives containing millions of hours of footage into a queryable, structured Video Memory & Cortex accessible to vertical LLM agents.
We're tackling foundational questions about what constitutes machine cognition: perception, memory, reasoning, and attention. We design what memory should hold and how it should be structured for machines that go beyond the context window, and we build the cortex that can reason across an entire archive of videos.
Our work spans corpus-level reasoning, knowledge extraction, indexing architectures, tool-based operations, multi-video understanding, and agentic workflows. This requires a close partnership between research and engineering, and we care deeply about building systems that are both scientifically rigorous and practically impactful.
Our research team has access to the most advanced chips in the world, including NVIDIA B300s, which lets us push the boundaries of video cognition and understanding systems and shorten our research-to-production cycle.
ABOUT THE ROLE:
This position spans two tracks—Research Scientist and Research Engineer—determined by your strengths and interests. These roles exist on a spectrum rather than as discrete categories; both contribute to research and implementation.
As a Research Scientist, you will explore fundamental questions in video reasoning, retrieval-guided QA, and multimodal understanding, driving research from hypothesis formulation to experiment design and analysis. You’ll investigate how video agents can interpret long-form content, connect evidence across modalities, and deliver reliable, structured outputs that reflect real user needs.
As a Research Engineer, you will focus on translating research ideas into robust, scalable systems. You’ll build and optimize pipelines that support indexing, retrieval, and agentic workflows, and develop the infrastructure that accelerates experimentation and ensures our models perform reliably in production. Your work will bridge the gap between exploratory research and the systems that power real-world applications.
Both tracks contribute to advancing our video agent capabilities, and both require a balance of curiosity, technical depth, and collaborative problem solving.
YOU MIGHT BE A GREAT FIT IF YOU HAVE:
We’re looking for candidates with research or engineering experience in areas related to video understanding, multimodal retrieval, temporal reasoning, or agent-driven workflows that combine models, tools, and structured search. You should be able to define meaningful research questions, design experiments, and drive projects from ideation to execution while grounding your work in real user scenarios and practical constraints.
Strong proficiency in Python and PyTorch is essential, along with the ability to clearly communicate technical concepts and collaborate across science, engineering, and product teams. Experience in building AI/ML agents and bringing them into production environments is highly preferred. This includes designing agent architectures, integrating them with retrieval or reasoning systems, and deploying them as reliable, user-facing components.
We evaluate based on relevant technical skills and research experience rather than degrees alone, though this is typically supported by an MS/PhD or equivalent practical experience in a relevant field.
WHAT MAKES THIS ROLE UNIQUE:
The team sits at the intersection of models, retrieval, reasoning, and user workflows. The systems you build will rapidly make their way into real products used by customers worldwide. We operate with a tight research-to-production loop, ensuring that innovations have immediate, meaningful impact.
OTHERS
- Work Location: Seoul Itaewon office + Pangyo satellite office
- Additional Info: Candidates may enroll as, or transfer in as, Technical Research Personnel (전문연구요원).
Even if you don't check every box, we encourage you to apply. If you're a zero-to-one achiever, a ferocious learner, and a kind team player who motivates others, you'll find a home at Twelve Labs.
HIRING PROCESS
Application Review → Recruiter Interview (remote, 30 min) → Hiring Manager Interview (remote, 30 min) → Technical Interview Round 1 (in person, 60 min) → Technical Interview Round 2 (remote, 90 min) → Final Round Interview (remote, 30 min) → Reference Check → Offer
ABOUT TWELVE LABS
Funding stage: Series A
Intel Capital Corporation started off as the investment arm of Intel Corporation in 1991 and in January 2025, it spun off as a standalone investment fund.
Employees: 51-200
Headquarters: San Francisco
REVIEWS
Overall rating: 3.8 (10 reviews)
Work-life balance: 4.2
Compensation: 2.8
Company culture: 4.0
Career development: 3.2
Management: 3.5
Recommendation rate: 65%
Pros:
- Great work-life balance
- Supportive team and environment
- Good company culture and friendly coworkers
Cons:
- Compensation/pay not competitive
- Limited career advancement opportunities
- Poor management and lack of direction
SALARY RANGE
5 data points
Senior/L5 · MACHINE LEARNING ENGINEER (1 report)
Total annual compensation: $318,500
Base salary: $245,000
Stock: -
Bonus: -