Staff SRE Tech Lead overseeing platform reliability and scalability at Unify. Leading an SRE pod while enhancing infrastructure performance and implementing reliability best practices.
Responsibilities
Lead the SRE pod: Set technical direction, drive prioritization, and mentor engineers—ensuring the team is tackling the highest-leverage reliability and scalability challenges.
Scale our data infrastructure: Architect and extend our ClickHouse and PostgreSQL deployments to handle terabytes of new data monthly by designing partitioning strategies, tuning queries, and building resilient replication and failover systems.
Improve system performance: Profile and optimize critical paths across our backend services, identify bottlenecks in data pipelines and API layers, and ship changes that meaningfully improve latency and throughput.
Build for reliability: Design and implement rate limiting, circuit breakers, graceful degradation, and other patterns that keep the platform stable under load and during partial failures.
Automate everything: Drive tooling that eliminates toil—automating deployments, scaling operations, backup verification, and incident remediation.
Instrument and observe: Build out distributed tracing, metrics, and alerting that give engineers clear visibility into system behavior and make debugging production issues fast.
Define and enforce SLOs: Establish reliability targets aligned with customer needs, manage error budgets, and drive architectural decisions that balance shipping speed with system stability.
Requirements
8+ years of software engineering experience with a strong backend foundation, including 3+ years focused on reliability, infrastructure, or platform work.
Experience leading teams or pods—setting technical direction, mentoring engineers, and driving execution on complex projects.
Deep expertise operating databases at scale, including schema design, query optimization, replication, and failover strategies.
Strong programming skills (TypeScript, Python, Go, or similar) with a track record of building automation and tooling that meaningfully reduces operational burden.
Collaborative, low-ego attitude with a history of leveling up the people around you.
DevSecOps Consultant integrating security into IT development and operational processes. Advising clients on seamless integration of security requirements into DevOps workflows.
DevOps Engineer designing, developing and supporting programs at Swift, the leading provider of secure financial messaging services. Involves system analysis, program development and team collaboration.
Senior DevSecOps Engineer delivering complex software applications with a talented team in the defense sector. The role requires strong Kubernetes and cloud platform knowledge.
Senior Infrastructure/DevSecOps Engineer delivering complex software applications. Collaborating with a talented team to enhance national security efforts at CACI.
Staff Infrastructure/DevSecOps Engineer delivering complex software applications in collaboration with a talented team. Driving innovation and supporting national missions at CACI with a commitment to integrity.
Platform DevOps Engineer at Booz Allen Hamilton developing and managing container platforms for cloud capabilities. Collaborating to improve client environments using the latest cloud technologies.
DevOps Engineer enhancing reliability and performance of Ciena's Blue Planet applications in cloud environments. Implementing automation and upgrade strategies for seamless delivery of services.
Site Reliability Engineer working on cloudification of backup services at Expleo. Contributing to infrastructure evolution with a team of skilled engineers.
Senior DevOps Engineer working on deployment and operations of FedRAMP-authorized products. Improving cloud infrastructure and collaborating with federal customers in a regulated environment.