Developing ML and computer vision solutions for a cutting-edge autonomous vehicle dataset pipeline at Mobileye. Collaborating across teams on data curation and advanced perception algorithms.
Responsibilities
Work collaboratively with shared ownership. Your focus area will be the curation and ML side of our data pipeline, but you will contribute across the full stack alongside the rest of the team.
Build and improve the curation pipeline - from vision-model embeddings and scene detection, through VLM-based scene analysis, to scoring, deduplication, and sampling that produces a balanced and diverse dataset.
Run and optimize GPU inference at scale (embedding extraction, VLM inference) across thousands of driving sessions using workflow orchestration.
Develop scoring and sampling strategies that ensure rare but important scenarios (night driving, adverse weather, hazardous situations) are well-represented in the final dataset.
Work with algorithm teams to understand what data gaps hurt model performance and translate those into curation criteria.
Build validation and diagnostics that measure dataset quality - not just pipeline health, but whether the data is actually good for training.
Contribute to the core dataset SDK, converter, and 3D-geometry tooling (camera projection, calibration, coordinate transforms).
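To give a flavor of the deduplication and sampling work listed above, here is a minimal illustrative sketch, not the team's actual pipeline. It assumes each scene already has a non-zero embedding vector and a single scenario tag. The function names, the 0.95 similarity threshold, and the inverse-frequency weighting are all hypothetical choices made for the example.

```python
from collections import Counter

import numpy as np


def deduplicate(embeddings, threshold=0.95):
    """Greedy near-duplicate removal (hypothetical helper).

    Keeps a scene only if its cosine similarity to every scene kept
    so far is below `threshold`. Assumes no embedding is all zeros.
    """
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    kept = []
    for i, vec in enumerate(normed):
        if not kept or np.max(normed[kept] @ vec) < threshold:
            kept.append(i)
    return kept


def rarity_weights(tags):
    """Inverse-frequency sampling probabilities: rare tags get more weight."""
    counts = Counter(tags)
    weights = np.array([1.0 / counts[t] for t in tags])
    return weights / weights.sum()


def sample_balanced(embeddings, tags, n, threshold=0.95, seed=0):
    """Deduplicate, then draw up to `n` scenes without replacement,
    favoring rare scenario tags. Returns indices into the original arrays."""
    kept = deduplicate(embeddings, threshold)
    probs = rarity_weights([tags[i] for i in kept])
    rng = np.random.default_rng(seed)
    chosen = rng.choice(len(kept), size=min(n, len(kept)), replace=False, p=probs)
    return sorted(kept[i] for i in chosen)
```

The greedy pass is quadratic in the number of kept scenes, so at the scale of thousands of driving sessions an approximate nearest-neighbor index would likely replace it. The sketch only shows the shape of the problem.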
Requirements
4+ years in ML engineering, applied CV, or a similar role combining model work with production data systems.
Hands-on experience with vision models - embeddings, VLMs, or object detection/segmentation.
Strong Python and comfort with the PyData stack (NumPy, PyArrow, Pandas, DuckDB).
Experience building data or ML pipelines that run at scale (not just notebooks).
Solid understanding of 3D geometry and camera models - or the mathematical background to ramp up quickly.
Good understanding of LLM agents and agentic workflows, with genuine interest in applying them to data and engineering problems.
Ability to work across team boundaries with algorithm and infrastructure people.
Data Engineer/Analyst maintaining and improving data infrastructure for Braiins. Collaborating with technical and business teams to ensure reliable data flows and insights.
Medior Data Engineer handling Azure migrations for a major urban mobility client. Focused on data pipeline development and ensuring platform reliability with cutting-edge technologies.
Data Migration Lead in a hybrid role managing data migration for a major transformation programme in the media sector. Collaborating with various teams to ensure data integrity and successful migration.
Consultant ML & DataOps at Smile integrating data science projects for major clients. Designing MLOps solutions and enhancing data governance in a collaborative environment.
Data Engineer developing and maintaining data pipelines for Coolbet’s analytical services. Working within an Agile framework to ensure data reliability and efficiency.
API Data Engineer developing innovative data-driven solutions and advancing data architecture for AI Control Tower. Building and integrating APIs and data pipelines to support organizational needs.
Journeyman Data Architect supporting Leidos' enterprise data and analytics program for the Department of War. Collaborating on solutions for data architecture, cloud environments, and governance.
Senior Software Engineer developing backend services and data infrastructure for integrated products at Booz Allen. Collaborating with a small elite team to deliver reliable and scalable services.
AWS Streaming Data Engineer developing software and systems in a fast, agile environment. Utilizing experience with real-time data ingestion and processing systems across distributed environments.
Mid-level Data Engineer ensuring efficient data transformation and integration for data annotation projects. Collaborating with teams to optimize data quality and performance in pipeline operations.