Site Reliability Engineer at Meniga, enhancing cloud infrastructure for digital banking solutions. Focus on reliability, scalability, and collaborative teamwork in a hybrid role.
Responsibilities
Design and evolve highly reliable, scalable, and resilient services running on Microsoft Azure and Google Cloud
Define, implement, and enforce reliability and operational standards for Infrastructure as Code (IaC)
Automate provisioning, configuration, deployment, and lifecycle management of cloud infrastructure
Design and operate scalable infrastructure platforms that meet defined Service Level Objectives (SLOs) and Service Level Agreements (SLAs)
Establish and maintain observability solutions (monitoring, logging, alerting, tracing)
Participate in and improve incident response processes
Collaborate closely with software engineering teams and enterprise stakeholders
Continuously assess and improve system reliability through various strategies
Drive adoption of modern reliability practices and tooling
Requirements
8+ years of IT experience
5+ years as an SRE, DevOps Engineer, or in a similar role
Strong knowledge of cloud technologies (Azure, GCP, AWS) and on-premises infrastructure
Advanced knowledge of Infrastructure as Code tools (e.g., Terraform)
Strong understanding of DevOps practices, CI/CD pipelines, and GitOps workflows
Expertise in GitOps tools, specifically hands-on experience with FluxCD
Experience with virtualization technologies and container orchestration (e.g., Kubernetes, Docker)
Proficiency in modern observability and monitoring, specifically the LGTM stack (Loki, Grafana, Tempo, Mimir)
Familiarity with network architecture, security protocols, and data storage solutions
Strong analytical and problem-solving skills, with the ability to manage multiple projects simultaneously
Bachelor's degree in Computer Science or a related field
Advanced English, with the confidence to participate actively in global client meetings
Full-Stack Engineer enhancing engineering productivity at Fidelity. Building internal tools for SRE teams to improve operational efficiency and reliability.
DevOps Engineer at Cloudogu working with development and operations for reliable software delivery. Focusing on CI/CD, infrastructure automation, and platform services in an agile environment.
Jr. DevOps Engineer supporting and improving CI/CD pipelines and Linux systems at Swift. Collaborating with senior engineers in a hands-on learning environment.
Senior DevOps Engineer I managing automation tooling and multi-cloud infrastructure at Spring Health. Collaborating with AI and Infrastructure teams in a hybrid Seattle office.
Site Reliability Engineer for cloudified backup platform using Commvault technology at Expleo. Joining a dynamic team to ensure backup infrastructure scalability and reliability.
Site Reliability Engineer responsible for designing and maintaining scalable services with high availability. Collaborating with development teams to enhance reliability and operational excellence.
Technical Staff leading the architecture, reliability, and modernization of enterprise ALM and DevOps tools. Driving strategy and influencing product development in collaboration with various teams.
Site Reliability Engineer responsible for reliability and availability, collaborating with development teams on scalable systems. Applying software engineering practices to improve production operations.
DevOps Engineer in the Security Data and AI Lab at Lloyds Banking Group driving data and cloud infrastructure's influence on product operations and customer service improvements.