Principal Data Engineer at Trainline shaping robust data foundations for AI- and ML-driven products. Collaborate with cross-functional teams to ensure best practices in data engineering and ML.
Responsibilities
Act as a technical authority across multiple teams, setting standards and patterns for data and ML‑adjacent infrastructure
Embed with ML teams to design, build and evolve data platforms supporting AI and ML workloads
Influence technical direction without direct line management responsibility
Partner with Data Engineering teams outside of ML to build a community and share best practices and findings across all areas
Identify systemic issues and proactively drive improvements across the data ecosystem
Look for short-term and strategic opportunities to enhance core platforms with new self-serve enablement features for ML and Data Engineering
Partner with MLEs to design data pipelines supporting model training, inference and experimentation
Design and review architectures for ML‑ready data platforms
Build and optimise data pipelines using SQL, Spark or Ray, and Python
Define best practices for orchestration using Airflow or similar tools
Support API‑driven and event‑based data access patterns
Work with AWS infrastructure such as ECS, vector databases and Bedrock APIs
Review designs and code across teams to raise quality and consistency
Coach engineers through pairing, design reviews and informal mentoring
Collaborate on innovative AI‑powered product features such as the Travel Assistant
Requirements
Extensive experience as a Senior, Staff or Principal Data Engineer operating across teams
Deep expertise in SQL and Python, with strong experience in Spark or similar tooling
Strong understanding of orchestration tools such as Apache Airflow
Experience designing data platforms for ML and AI workloads
A track record of introducing new technologies and practices, and of navigating ambiguity and multiple stakeholders
Hands‑on experience with AWS infrastructure (e.g. ECS, IAM, data storage, compute)
Familiarity with vector databases and modern AI/ML APIs (e.g. Bedrock)
Experience working closely with Machine Learning Engineers in production environments
Strong system design skills and the ability to influence through technical leadership
Software Developer in Test working on a cloud-based data platform at Tecsys. Ensuring the quality and reliability of data pipelines and transformations using automation frameworks.
Data Engineer responsible for designing, building, and optimizing data pipelines and architectures in a tech environment. Requires extensive experience with modern data warehousing and cloud platforms.
Lead Data Engineer role at Brillio focusing on AI & Data Engineering with expertise in Azure and MS Fabric. Collaborate within the Data Engineering team in Pune, Maharashtra, India.
Data Architect at Whiteshield designing scalable, secure data architectures for national and enterprise transformation programs. Architecting modern data platforms to support analytics, AI and operational use cases.
Data Engineer managing scalable data ecosystems for actionable business intelligence and cross-functional stakeholder collaboration. Optimizing ETL/ELT pipelines and ensuring data integrity and security.
Data Engineer specializing in data architecture and solutions for a banking environment, driving value for customers through innovative engineering practices and technologies in data management.
Technical Lead for data engineering and reporting in healthcare technology at Dedalus. Shaping innovative software solutions and leading cross-functional technical teams in Australia.
Senior ML Data Engineer working on data pipeline curation for Mobileye's autonomous vehicle dataset. Collaborating across teams to enhance ML engineering and vision model applications.
Data Engineer managing customer datasets to enhance industrial research and development. Responsible for ETL pipelines and data ingestion for the Uncountable Web Platform.
Data Engineer designing and maintaining scalable data solutions on Databricks for clinical trials. Collaborating with teams to overcome data challenges and ensure the smooth logistics of clinical supplies.