Data Scientist designing evaluation metrics and pipelines to enhance answer quality for Perplexity's AI products. Collaborating in a high-impact team using advanced machine learning methodologies.
Responsibilities
Architect and maintain automated evaluation pipelines to assess answer quality across Perplexity's products, ensuring high standards for accuracy and helpfulness
Design evaluation sets and methods specifically to measure the impact of tool calls (particularly web search retrieval) on the final answer's quality
Develop VLM-based solutions to programmatically evaluate how final answers render visually across different platforms and devices
Continuously review public benchmarks and academic evaluations for their applicability to the Perplexity product, adapting and incorporating them into our regular performance measurements
Operate within a small, high-impact team where your evaluation metrics directly shape product changes, collaborating closely with technical leadership to measure and improve answer quality
Requirements
PhD or MS in a technical field or equivalent experience
4+ years of experience in data science or machine learning
Strong proficiency in Python and SQL (expected to write production-grade code)
Experience building within a modern cloud data stack, specifically AWS and Databricks
Comfortable with agentic coding workflows and using AI-assisted development tools to iterate faster
1+ years of experience working with LLMs at scale, specifically with LLM-as-a-judge setups (preferred)
Prior experience working on customer-facing web products or consumer apps, with real user traffic at scale (preferred)
A strong research background, with experience applying research methods to real-world ML problems (preferred)
Experience defining evaluation metrics (e.g., factual consistency, hallucination rate, retrieval precision) and building ground truth datasets (preferred)
Benefits
U.S. Benefits
Full-time U.S. employees enjoy a comprehensive benefits program including equity, health, dental, vision, retirement, fitness, commuter and dependent care accounts, and more.
International Benefits
Full-time employees outside the U.S. enjoy a comprehensive benefits program tailored to their region of residence.