Senior Site Reliability Engineer at Getrak responsible for platform reliability and monitoring in critical environments. Collaborating with engineering teams on performance and observability across a major SaaS platform.
Responsibilities
Define, implement, and monitor SLIs/SLOs for availability, latency, and reliability.
Design and optimize CI/CD pipelines for microservices in high-availability environments.
Manage and evolve infrastructure on AWS (EC2, ECS/EKS, S3, RDS, CloudFront, VPC, IAM, CloudWatch, etc.).
Manage distributed databases and critical systems: Astra DB / Cassandra (DataStax), Redis, and RabbitMQ.
Automate provisioning, configuration, and scalability with Terraform, Ansible, or similar tools.
Develop and maintain observability practices (metrics, logs, tracing) using DataDog and related tools.
Lead investigations into critical incidents and run blameless postmortems, proposing definitive solutions.
Work on cloud cost optimization, balancing reliability and budget.
Ensure infrastructure security and compliance, with access policies, backups, and continuous auditing.
Collaborate with engineering and product teams, bringing a reliability mindset to the development cycle.
Requirements
6+ years of SRE/DevOps experience in high-scale, mission-critical environments.
Strong expertise in AWS and cloud-native architecture.
Advanced knowledge of Cassandra (Astra DB / DataStax), Redis, and RabbitMQ.
Experience with microservices and containerization (Docker, Kubernetes, ECS/EKS).
Senior DevSecOps Engineer/Developer responsible for building Humana's software security platform. Modernizing architecture and managing CI/CD pipelines as part of the core engineering team.
Senior Information Security Analyst focusing on DevSecOps for Unidas, a major mobility company in Brazil. Responsible for optimizing security governance processes and delivering secure software.
DevOps Manager overseeing scaling for Seekr's AI platform using Kubernetes, Terraform, and Ansible. Leading a hands-on team and collaborating with engineering for efficiency.
Back-End & DevOps Software Developer contributing to building digital products to change the world. Specializing in back-end development and command of the DevOps ecosystem for robust infrastructure.
Lead DevOps Developer at Boeing, focusing on CI/CD and cloud infrastructure management. Collaborating with teams to automate processes and improve system performance across environments.
Vulnerability & Configuration Management Engineer responsible for vulnerability management and remediation processes at Relax Gaming. Collaborate with IT teams to improve security measures across various platforms.
DevOps Engineer designing and maintaining Azure-based hybrid cloud infrastructure for a company specializing in nature-based smart city solutions. Leading cloud architecture and mentoring engineers as part of a high-impact team.
SRE responsible for ensuring reliability and performance of IT systems at a digital transformation company specializing in public sector efficiency. Collaborating on system health, incident response, and automation tasks.
DevOps Senior role at Beyond Soluções managing CI/CD for .NET and Kubernetes applications. Collaborating on cloud solutions while fostering a culture of innovation and quality.
Senior Software Engineer at PayPal managing cloud infrastructure and DevOps solutions. Delivering complete SDLC solutions and guiding engineering teams for scalable and reliable services.