Staff Site Reliability Engineer – Observability at CVS Health | Hybrid Hired

About the role

Staff Site Reliability Engineer focused on observability at CVS Health, leading the design and implementation of observability systems across distributed and edge computing environments.

Responsibilities

Lead the design, implementation, and optimization of observability systems
Collaborate with cross-functional teams to build robust monitoring, alerting, and telemetry solutions
Drive best practices, mentor others, and shape the strategic evolution of our observability ecosystem
Design and implement comprehensive observability solutions tailored for edge computing environments
Define and maintain Service Level Indicators (SLIs), Service Level Objectives (SLOs), and business KPIs
Build and optimize dashboards, visualizations, and alerting systems
Implement distributed tracing and log aggregation systems
Collaborate with engineering teams to ensure applications and infrastructure at edge locations are designed with observability in mind
Drive proactive identification of issues in edge facilities
Lead incident postmortems and implement observability-driven improvements
Develop and maintain tools, scripts, and automation to enhance observability pipelines
Evaluate and integrate industry-standard observability tools

Requirements

7+ years of experience in Site Reliability Engineering, Observability Engineering, or a related field
5+ years of experience with observability tools and platforms such as Prometheus, Grafana, Splunk, ELK, OpenTelemetry, or similar
3+ years of experience with microservices, containerized environments (e.g., Kubernetes, Docker), and distributed systems, particularly in edge deployments
Experience implementing AIOps
Strong proficiency in programming/scripting languages (e.g., Python, Java) for automation and tooling in distributed environments
Certifications in cloud platforms (e.g., Google Cloud Professional certification) or Kubernetes
Knowledge of incident management processes and tools (e.g., ServiceNow, xMatters, Opsgenie) tailored for distributed systems

Benefits

Affordable medical plan options
401(k) plan (including matching company contributions)
Employee stock purchase plan
No-cost programs including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching
Paid time off
Flexible work schedules
Family leave
Dependent care resources
Colleague assistance programs
Tuition assistance
Retiree medical access
