Reliability Engineer on the core team ensuring safe and stable Autonomous Vehicle software releases. Collaborates across testing environments to improve reliability and deliver data-driven insights.
Responsibilities
Own the reliability triage framework for the AV software stack, defining how failures from simulation, CI, and on-road validation are detected, categorized, and escalated into actionable insights.
Perform deep debugging and root-cause analysis across autonomy software, ML pipelines, and system integrations, connecting failure symptoms to clear solution paths and corrective actions.
Design and evolve automated triage mechanisms and reliability taxonomies, improving regression detection, flakiness identification, and signal quality as the system and models evolve.
Build and govern reliability data pipelines, providing continuous visibility into stability trends, recurrence patterns, and systemic risks that impact release readiness.
Translate reliability findings into decision-grade communication, influencing prioritization, technical debt reduction, and release confidence in partnership with engineering, safety, and systems stakeholders.
Requirements
Strong proficiency in Python and SQL for automation, analysis, and data pipelines
Proven experience with CI/CD systems (GitHub Actions, Jenkins, GitLab, or equivalent)
Hands-on experience implementing ETL/ELT pipelines for reliability, quality, or system health monitoring
Solid understanding of reliability engineering concepts, including regression tracking, flakiness detection, and failure classification
Strong analytical and cross-stack debugging skills in large-scale software systems
Experience integrating simulation, HIL, or system-level test signals into automated analysis workflows
Track record of effective cross-functional collaboration across engineering, QA, and platform teams
Ability to operate autonomously in high-ambiguity, safety-critical environments
Excellent communication skills for presenting data-driven reliability insights to engineering and technical leadership
Bachelor’s, Master’s, or PhD in Computer Science, Electrical Engineering, Robotics, or a related field, or equivalent experience
Benefits
GM offers a variety of health and wellbeing benefit programs.
Benefit options include medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation & holidays, tuition assistance programs, employee assistance program, GM vehicle discounts and more.