Senior Site Reliability Engineer at Uniphore developing cloud infrastructure and Go services. Collaborating with teams to ensure operational excellence and reliability.
Responsibilities
Review RFCs and PRDs to prevent downstream issues; provide architectural guidance during planning phases, including API design and service contract review
Design and build internal Go services, CLIs, and automation pipelines that replace manual processes and eliminate support dependencies
Design incident response frameworks, escalation procedures, and comprehensive playbooks; build tooling that automates runbook steps and reduces MTTR; participate in our on-call program
Define technical standards and operational frameworks, then codify them as enforced policy through admission controllers, operator logic, pipeline gates, and similar mechanisms; automate policy enforcement rather than just authoring documentation
Guide teams through ownership maturity, scorecard compliance, and operational best practices
Requirements
5+ years in DevOps/SRE roles with a track record of transforming operational models
Production Go experience: you write Go regularly, understand its concurrency model, and are comfortable owning Go services in production
Kubernetes depth: operational expertise plus the ability to extend it; you understand the controller-runtime model and could write or maintain a Kubernetes Operator
Production Excellence: deep incident management, RCA processes, and on-call system design experience
Software engineering fundamentals: API design, testing, observability instrumentation, and service lifecycle ownership; you treat internal tooling with the same rigor as customer-facing software
Standards & Documentation: strong technical writing; able to create operational procedures that teams can self-execute
Architecture & Planning: RFC/PRD review experience; you catch operational problems at design time
Collaboration & Coaching: track record of enabling team capabilities through tooling and knowledge transfer, not just doing work for teams
Site Reliability Engineer working on Linux systems for observability platforms and logging. Design and maintain applications, support network visibility, and collaborate with teams.
DevOps Engineer working at White Circle, focusing on infrastructure for AI systems. Involves managing production environments, Kubernetes, CI/CD pipelines, and automation tools.
Airflow Reliability Engineer on the Customer Reliability Engineering team at Astronomer. Working with clients on optimizing their use of the managed Airflow service in a hybrid role in Hyderabad.
Full-Stack Engineer enhancing engineering productivity at Fidelity. Building internal tools for SRE teams to improve operational efficiency and reliability.
DevOps Engineer at Cloudogu working with development and operations for reliable software delivery. Focusing on CI/CD, infrastructure automation, and platform services in an agile environment.
Jr. DevOps Engineer supporting and improving CI/CD pipelines and Linux systems at Swift. Collaborating with senior engineers in a hands-on learning environment.
Senior DevOps Engineer I managing automation tooling and multi-cloud infrastructure at Spring Health. Collaborating with AI and Infrastructure teams in a hybrid Seattle office.
Site Reliability Engineer for a cloudified backup platform using Commvault technology at Expleo. Joining a dynamic team to ensure backup infrastructure scalability and reliability.
Site Reliability Engineer responsible for designing and maintaining scalable services with high availability. Collaborating with development teams to enhance reliability and operational excellence.