Site Reliability Engineer responsible for building and maintaining cloud infrastructure at Tricentis. Collaborating with product engineers and enhancing operational processes for seamless scaling with innovative solutions.
Responsibilities
Design, build, and maintain the product cloud infrastructure that enables seamless scaling
Develop advanced monitoring systems that proactively alert on symptoms
Use tools like Terraform, GitHub Actions, and Kubernetes to efficiently manage AWS or Azure infrastructure
Continuously enhance operational processes, making deployments, upgrades, and other tasks as boring and automated as possible
Collaborate with product engineers on a daily basis and influence product architecture designs
Be part of an on-call rotation to respond swiftly to incidents affecting availability
Act as a reliability champion for stable counterpart assignments
Propose innovative ideas and solutions within the SRE organization and engineering
Proactively identify opportunities to enhance system availability and performance
Share learnings with the wider community
Be the first responder during emergencies and on-call duties
Requirements
Proficiency in Terraform syntax and GitHub Actions configuration
Working knowledge of SaaS architecture concepts and designs
Understanding of Kubernetes, including CLI usage and service re-provisioning
Ability to provision and set up metrics along with managing alerts and silences
Identify Service Level Indicators (SLIs) that align the team with availability and latency objectives
Experience with Linux operating system configuration, package management, and troubleshooting
Working experience with cloud environments like Azure or AWS and provisioning infrastructure there
Good cultural fit: clear communication, empathy, curiosity and continuous learning, and a supportive, no-blame attitude
DevOps Engineer automating and configuring network monitoring and automation solutions for Telia’s telecom operations in Finland. Ensuring performance, resilience, and high observability of critical platforms.
Client Services Consultant specializing in DevOps Mainframe Operations with experience in automation best practices. Analyzing Life Cycle Management data needs and evaluating solutions for Endevor-related operations.
Senior AWS DevOps Engineer at LexisNexis shaping global CI/CD platform. Collaborating with teams to deliver secure, reliable, and scalable delivery pipelines.
Cloud Engineer at MetroStar focusing on building and securing cloud-native systems. Managing Kubernetes workloads and CI/CD pipelines in Agile teams with an emphasis on security.
Senior Engineer Cloud Engineering role focused on AWS migration and automation. Collaborating with teams to innovate cloud patterns and infrastructure best practices.
Senior Operations Engineer driving efficiency and reliability in NVIDIA's global business operations. Collaborating with IT subsystems and automating operational workflows for organizational impact.
Lead or Senior DevOps Developer joining Boeing Defense, Space and Security for advanced technology missions. Involves CI/CD, cloud systems design, and collaboration with government customers.
Site Reliability Engineer ensuring high availability and performance for digital platforms in retail. Collaborating with engineering teams for automation and observability practices.
Associate Site Reliability Engineer supporting the reliability and performance of global IT infrastructure at Exegy. Engaging with senior engineers and learning foundational systems engineering skills.
Site Reliability Engineer driving innovation and growth for Banking Solutions, Payments, and Capital Markets business. Responsible for application reliability and incident response in a hybrid work environment.