About the role

As Director of Site Reliability Engineering at Mastercard, you will oversee resilience and operational excellence initiatives and lead a high-performing team of technical leaders within CX Technology.

Responsibilities

Lead and develop a team of highly skilled people leaders and senior individual contributors within the CX Technology organization, fostering a culture of accountability, innovation, and continuous improvement
Define and drive the short-term and medium-term strategic vision for Site Reliability Engineering, aligning reliability, scalability, and operational efficiency initiatives with broader Mastercard technology and business objectives
Lead the design and execution of cross-functional initiatives that improve system resilience, automate operational processes, and mature incident management, problem management, and reliability engineering practices
Establish, evolve, and govern reliability standards, operational best practices, and control frameworks to ensure consistent adoption across engineering and delivery teams
Partner closely with engineering, product, architecture, and business stakeholders to embed reliability requirements into system design, development, deployment, and lifecycle management processes
Oversee major incident response and escalation efforts, ensuring rapid recovery, effective communication, and high-quality root cause analysis with actionable remediation
Promote proactive risk identification and mitigation through observability, capacity planning, resiliency testing, and automation-driven approaches
Champion continuous improvement by leveraging operational metrics, insights, and retrospectives to drive measurable improvements in availability, stability, and customer experience
Stay informed on industry trends, emerging technologies, and modern SRE practices, applying relevant innovations to advance Mastercard’s operational maturity
Manage goal setting, coaching, performance management, and talent development for people leaders and senior technologists, building a strong leadership pipeline and sustaining operational excellence at scale

Requirements

Proven experience leading Site Reliability Engineering, Production Engineering, or large-scale operations teams within complex, highly available, distributed technology environments
Strong people leadership background, including managing managers and/or senior technical leaders, with demonstrated success building high-performing, inclusive teams
Deep understanding of reliability engineering principles, including incident management, automation, telecom, observability, resilience engineering, capacity planning, and service lifecycle management
Demonstrated ability to translate strategy into execution by evolving processes, programs, and policies to drive meaningful and measurable operational improvements
Experience partnering across engineering, product, and business functions to influence design decisions and embed reliability throughout the development lifecycle
Strong analytical and problem-solving skills, with a track record of driving root cause analysis and long-term corrective actions
Excellent communication and stakeholder management skills, with the ability to lead through influence at senior and executive levels
Passion for continuous improvement, operational discipline, and leveraging technology to reduce toil and improve system outcomes at scale
Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience; advanced degree preferred

Security responsibilities

Abide by Mastercard’s security policies and practices
Ensure the confidentiality and integrity of the information being accessed
Report any suspected information security violation or breach
Complete all periodic mandatory security trainings

Onsite Director, Site Reliability Engineering

at TASC
