Research Scientist, Science of Post-Training and Reinforcement Learning
London, UK · On-site · Full-time · 6d ago
Snapshot
We are starting a small team aimed at building a real science of post-training for agents. This involves reinforcement learning for LLM-based systems, rigorous experimentation, and a focus on scaling, evaluation, and the practical details that make methods work.
This Research Scientist role is intentionally hands-on. The core loop is: form a hypothesis, implement it, run strong experiments, analyze what happened, and decide what to do next. We care about research that holds up over time, not just incremental wins.
About Us
Artificial Intelligence could be one of humanity’s most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.
The Role
You will work closely with Ian Osband and the team on research around post-training for agents and LLMs, including practical RL methods and evaluation. This is not a theory-only role; you should expect to implement code, run experiments, and own results end-to-end. Success in this role is defined by whether the team learns faster and whether the work produced is crisp, honest, and high-quality.
Key Responsibilities
Propose and test research hypotheses in post-training and RL for agents/LLMs.
Implement algorithm ideas and run end-to-end experiments, including setup, execution, analysis, and iteration.
Design evaluations and ablations that answer real questions and change minds.
Analyze results carefully, including debugging and failure analysis.
Communicate clearly through plots, writeups, and paper-ready narratives and figures.
Collaborate closely with engineering and research partners to keep the team aligned on findings and strategy.
Contribute to a culture of first-principles thinking, high standards, and direct, constructive feedback.
About You
To set you up for success as a Research Scientist at Google DeepMind, we look for the following skills and experience:
A research track record in ML/RL, demonstrated through publications or high-quality projects.
Strong implementation ability and comfort working in research codebases.
Evidence of owning experiments end-to-end, including analysis and interpretation.
Strong communication skills and a bias toward clarity and honesty regarding results.
High agency and drive: You push projects forward, prioritize effectively, and take initiative.
PhD in ML preferred, or equivalent practical experience.
In addition, the following would be an advantage:
Experience with RL for sequence models, post-training, preference-based learning, or agentic systems.
Experience with modern research stacks (e.g., JAX/Flax or PyTorch) and scaling experiments.
Strong experimental taste: Good judgment regarding baselines, ablations, and what is worth testing.
Comfort with scaling, evaluation methodologies, and diagnosing complex failure modes.
A focus on craft: You care about doing excellent work while maintaining a high velocity.
At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunities regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.
Note: In the event your application is successful and an offer of employment is made to you, any offer of employment will be conditional on the results of a background check, performed by a third party acting on our behalf. For more information on how we handle your data, please see our Applicant and Candidate Privacy Policy.
Closing date: Tuesday, 17th March at 5:00pm GMT
About Google DeepMind

Google DeepMind
Acquired
DeepMind Technologies Limited, trading as Google DeepMind or simply DeepMind, is a British-American artificial intelligence research laboratory which serves as a subsidiary of Alphabet Inc.
Employees: 1,001-5,000
Headquarters: London
Reviews
3.8 (10 reviews)
Work Life Balance: 3.8
Compensation: 4.2
Culture: 3.5
Career: 4.0
Management: 2.8
Recommend to a Friend: 68%
Pros
Smart and brilliant colleagues
Good compensation and benefits
Work flexibility and remote options
Cons
Poor management and leadership issues
Bureaucracy and slow processes
Constantly changing priorities and goals
Interview Experience
5 interviews
Difficulty: 3.0 / 5
Duration: 21-35 weeks
Offer Rate: 60%
Experience: Positive 60%, Neutral 40%, Negative 0%
Interview Process
1. Application Review
2. Phone Screen/Online Assessment
3. Technical Interview
4. Team Matching Interview
5. Offer
Common Questions
Coding/Algorithm
Technical Knowledge
Behavioral/STAR
Research Experience
System Design