Data Scientist creating scalable insights from unstructured data at an AI safety company. Collaborating with engineering and research teams in a hybrid role based in Paris.
Responsibilities
Turn petabytes of unstructured text into a structured, explorable view (topics, clusters, segments, trends, anomalies): iterate from “unknown unknowns” to stable definitions we can track.
Build scalable representation pipelines: sampling strategies, preprocessing/normalization, embeddings at scale, indexing, and retrieval to make the corpus searchable and analyzable.
Use LLMs pragmatically: labeling/classification, weak supervision, data enrichment, summarization, and automated diagnostics of inbound volumes (with cost/quality controls).
Deliver insights that change decisions: translate findings into product and operational actions (what data we have, what’s missing, where quality breaks, what to prioritize next).
Ship self-serve analytics: datasets, data models, and lightweight tools/dashboards so the team can explore and answer questions without ad-hoc requests.
Partner closely with engineering/research: align pipelines with production constraints (latency/cost/privacy), and integrate outputs into workflows.
Requirements
Strong Python + SQL with an engineering mindset: you can build reliable pipelines, not just notebooks.
Solid applied NLP/ML experience on real-world text: embeddings, clustering, topic modeling, semantic search, classification; you understand failure modes and how to debug them.
Comfortable at scale: distributed processing, large-scale storage and querying, and performance/cost tradeoffs.
You know how to evaluate fuzzy problems: offline/online metrics, human-in-the-loop labeling, inter-annotator agreement, drift monitoring, and reproducibility.
Prior work with safety/moderation datasets, policy/rule systems, or high-volume logging/observability
Benefits
20 days of paid vacation
Work from Paris (hybrid) + relocation package
Best medical insurance in France
All the hardware, tools, and services you need
Covered subscriptions for AI agents and IDEs
Team off-sites twice a year: we’ve recently been to the Alps and to Saint-Tropez
Analytics Engineer for Customer Product Analytics team at Just Eat Takeaway.com. Enhancing user experience by delivering data-driven insights and optimizing product experience in a hybrid role.
Analytics Engineer transforming raw data into organized datasets. Collaborating with business teams to ensure data quality and governance in Azure environment.
Data Engineer creating scalable data pipelines for Capgemini's analytics solutions using modern ETL tools. Involves data storage and management using platforms like Snowflake and Redshift.
Analytics Engineer blending data analysis, business intelligence, and data engineering for a healthcare software company. Creating dashboards and data models to empower decision-making and improve healthcare outcomes.
Lead Web Analytics Developer responsible for the architecture and implementation of web analytics tools at Hostinger. Collaborating with teams to enhance user experience and improve conversion rates.
Principal managing analytics engineering projects for private equity, leveraging AI technology. Leading cross-functional teams and contributing to business development in a hybrid work environment.
Prototyping analytical tools developed by the engineering and analytics teams for Clir Renewables. Collaborating on automation scripts and methods for renewable energy analysts.
Analytics Engineer Sr developing data pipelines and analytics solutions for Brazil's retail industry. Collaborating with marketing and product teams to optimize data usage and performance.
Senior SAP Data Solutions Engineer designing and implementing data solutions in SAP environments. Contributing to SAP migrations and collaborating with cross-functional teams for project success.
Analytics Engineer creating data products that meet various business needs for the fintech company Stone. Responsible for data pipelines, monitoring, and ensuring data quality.