Senior Research Scientist creating evaluation methods and benchmarks for LLMs at Cohere. Working with cross-functional teams to advance AI capabilities and model performance evaluation.
Responsibilities
Create ambitious new evaluation benchmarks that push the limits of what our models can accomplish.
Work on highly cross-functional teams to translate model feedback into trustworthy, repeatable evaluations.
Conduct research to advance the state of the art in LLM evaluation methods, including training LLM judges, refining LLM-based data synthesis pipelines, and improving evaluation efficiency.
Build scalable and reusable tools for digging into model performance.
Requirements
You enjoy rapidly building prototypes that demonstrate the boundaries of what LLMs are capable of, and you have developed resources to measure those capabilities.
You have spent dozens of hours reviewing complex data and LLM outputs to ensure high data quality.
You are obsessive about rigorously measuring AI capabilities, and also about making sure your measurements actually align with the capabilities you care about.
You have strong software engineering skills.
Benefits
An open and inclusive culture and work environment
Work closely with a team on the cutting edge of AI research
Weekly lunch stipend, in-office lunches & snacks
Full health and dental benefits, including a separate budget to take care of your mental health
100% Parental Leave top-up for 6 months for employees based in Canada, the US, and the UK
Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
Remote-flexible, with offices in Toronto, New York, San Francisco, London, and Paris, plus a co-working stipend