Senior Research Scientist leading research efforts on reward models for AI. Shaping how models understand and optimize for human preferences with a focus on AI safety and capability.
Responsibilities
Lead research on novel reward model architectures and training approaches for RLHF
Develop and evaluate LLM-based grading and evaluation methods, including rubric-driven approaches that improve consistency and interpretability
Research techniques to detect, characterize, and mitigate reward hacking and specification gaming
Design experiments to understand reward model generalization, robustness, and failure modes
Collaborate with the Finetuning team to translate research insights into improvements for production training pipelines
Contribute to research publications, blog posts, and internal documentation
Mentor other researchers and help build institutional knowledge around reward modeling
Requirements
A track record of research contributions in reward modeling, RLHF, or closely related areas of machine learning
Experience training and evaluating reward models for large language models
Comfort designing and running large-scale experiments with significant computational resources
Ability to work effectively across research and engineering, iterating quickly while maintaining scientific rigor
Enjoyment of collaborative research and the ability to communicate complex ideas clearly to diverse audiences
Deep care about building AI systems that are both highly capable and safe
Strong candidates may also have
Published research on reward modeling, preference learning, or RLHF
Experience with LLM-as-judge approaches, including their calibration and reliability challenges
Worked on reward hacking, specification gaming, or related robustness problems
Experience with constitutional AI, debate, or other scalable oversight approaches
Contributed to production ML systems at scale
Familiarity with interpretability techniques as applied to understanding reward model behavior