Senior Data Engineer responsible for building high-performance data pipelines for satellite analytics, collaborating with ML Engineers and product teams to turn satellite data into actionable insights.
Responsibilities
Build Scalable Data Pipelines: Design and maintain robust ETL/ELT workflows using Prefect and Ray to ingest, process, and standardize massive volumes of satellite imagery.
EO Data Management: Own the standardization of high-resolution SAR and optical imagery, focusing on normalization, tiling/chipping, and co-registration sanity checks to ensure data integrity.
Infrastructure & Tooling: Optimize our cloud-native stack on AWS, leveraging Databricks and PostgreSQL to manage metadata and operational data stores.
Collaborative AI Support: Partner closely with ML Engineers to deliver production-ready data components and inference interfaces that downstream teams can depend on.
Data Quality & Diagnostics: Work hand-in-hand with the data annotation team to automate feedback loops on data quality and ensure datasets reflect real-world edge cases.
System Reliability: Implement monitoring signals and deterministic evaluation frameworks to ensure pipeline reproducibility across various geographies and acquisition conditions.
Requirements
Strong Software Engineering: Mastery of Python with a focus on clean, maintainable, and testable code.
Data Orchestration & Compute: Proficiency in using Prefect (or Airflow) and distributed computing with Ray, whether self-managed or on the Anyscale platform.
Cloud & Big Data: Deep expertise in AWS infrastructure and Databricks for large-scale data processing.
Database Management: Strong knowledge of PostgreSQL and managing complex metadata at scale.
Pragmatic Delivery: A mindset that balances building robust, long-term infrastructure with the need for practical, iterative delivery.
Geospatial Stack: Experience with GDAL, Rasterio, GeoPandas, and STAC for handling Earth Observation data is a plus.
ML Integration: Familiarity with PyTorch Lightning and MLflow to better support the ML R&D lifecycle is a plus.
SAR Experience: Basic knowledge of SAR preprocessing libraries and data formats is a plus.
Benefits
The opportunity to create a product that can improve business processes and lives across the globe.
Flexible working hours and hybrid work model - we trust our employees to get their work done while maintaining a healthy work-life balance.
We empower employees to drive their own career development, take initiative and have the freedom to be creative and bold.
Not an overtime culture - overtime is worked only when necessary and is always offset with time off and rest.
A collaborative and learning environment - frequent internal workshops, knowledge sharing sessions, journal clubs and hackathons.
Office located in the centre of Berlin Kreuzberg with free fruit, nuts and drinks.
Potential to participate in the employee stock option program.
Urban Sports membership and BVG subsidy, corporate pension program.
A diverse and vibrant international environment of 30+ different nationalities.
Job title
Senior Data Engineer – Remote Sensing, AI Pipelines