Site Reliability Engineer responsible for system reliability and performance at a leading financial services technology company. Collaborating with infrastructure, engineering, and security teams to build robust systems.
Responsibilities
Maintain and improve the uptime, performance, and availability of production systems.
Define and track SLIs, SLOs, and SLAs to ensure service reliability and user satisfaction.
Implement and manage monitoring, alerting, and observability tools (e.g., Prometheus, Grafana, Datadog, ELK).
Participate in on-call rotations and respond to incidents, performing root cause analysis and postmortems.
Automate repetitive tasks and processes using scripts, configuration management, and Infrastructure as Code (IaC).
Develop CI/CD pipelines to streamline deployment and operational processes.
Analyze system performance and capacity trends to plan for future growth.
Collaborate with engineering teams to design systems that scale reliably.
Support cloud and/or hybrid infrastructure (AWS, Azure, GCP, VMware, etc.).
Manage system provisioning, configuration, and patching via tools such as Ansible, Terraform, or Puppet.
Act as a bridge between development and operations teams, championing DevOps and SRE principles.
Contribute to a culture of continuous improvement, reliability, and accountability.
Requirements
Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
3+ years of experience in a Site Reliability, DevOps, or Systems Engineering role.
Experience with Linux/Unix and Windows systems, shell scripting, and system administration.
Proficiency in at least one programming/scripting language (Python, Go, Bash, etc.).
Hands-on experience with cloud platforms (AWS, Azure, or GCP).
Strong knowledge of networking, security, load balancing, and DNS.
Experience with monitoring/logging tools (e.g., Prometheus, Grafana, ELK, Splunk, Datadog).
Benefits
Flexibility: Hybrid Work Model & a Business Casual Dress Code, including jeans
Your Future: 401k Matching Program, Professional Development Reimbursement
Work/Life Balance: Flexible Personal/Vacation Time Off, Sick Leave, Paid Holidays
Your Wellbeing: Medical, Dental, Vision, Employee Assistance Program, Parental Leave
Diversity & Inclusion: Committed to Welcoming, Celebrating and Thriving on Diversity
Training: Hands-On, Team-Customized, including SS&C University
Extra Perks: Discounts on fitness clubs, travel and more!