Senior DevOps Engineer modernising environment landscapes through IaC and SRE principles while collaborating across teams for a global engineering firm.
Responsibilities
Design, implement, and maintain Infrastructure-as-Code (IaC) for consistent and repeatable provisioning of development and test environments, primarily using Terraform.
Lead technical investigations and act as the escalation point for environment-related incidents, outages, configuration issues, and service degradation across non-production platforms.
Collaborate closely with development, QA, and platform teams to deliver scalable, automated, and resilient environment solutions.
Analyse and optimise performance of non-production systems, identifying and resolving environment bottlenecks.
Maintain environment fidelity and integrity through controlled configuration management, patching, versioning, and rollback strategies.
Support release and deployment planning, ensuring environment readiness, dependency alignment, and overall stability during release cycles.
Implement and maintain monitoring, observability, and logging frameworks, with a strong emphasis on Dynatrace and CNCF-aligned tooling.
Define meaningful, proactive alerting policies that reduce noise, highlight real issues, and accelerate response times.
Apply SRE principles such as SLIs/SLOs, automated remediation, and continuous feedback loops to improve environment uptime and reliability.
Mentor junior engineers, share best practices, and contribute to knowledge bases, documentation, and process maturity.
Support Disaster Recovery (DR) testing, validating end‑to‑end system recovery, integration behaviour, and service resilience during failover scenarios.
Champion automation and operational excellence, reducing manual effort and increasing the team’s ability to deliver environments at scale.
Requirements
Strong knowledge of VMware, vSphere, virtualisation platforms, and on‑premise infrastructure management.
Expertise in Terraform and experience defining an organisation-wide IaC strategy.
Proficient in scripting and automation (Python, Bash, PowerShell).
Strong communication, documentation, and collaborative problem-solving skills.
Hands-on experience with on-premise infrastructure, virtualisation, containerisation, and exposure to cloud platforms such as AWS or Azure.
Understanding of performance engineering, including load testing frameworks and performance analysis.
Experience supporting QA, development, and release management teams with reliable, well-controlled non-prod environments.
Ability to troubleshoot complex multi‑layered issues across infrastructure, networks, applications, middleware, and databases.
Familiarity with SRE principles and modern operational practices such as postmortems, runbooks, SLIs/SLOs, error budgets, and automated recovery patterns.
Experience with APM and observability tooling, ideally Dynatrace, including metrics, traces, dashboards, and alerting configuration.
Benefits
Collaborative working environment – we stand shoulder to shoulder with our clients and our peers through good times and challenges
We empower all passionate, technology-loving professionals by allowing them to expand their skills and take part in inspiring projects
Expleo Academy enables you to acquire and develop the right skills through a suite of accredited training courses
Competitive company benefits
Always working as one team, our people are not afraid to think big and challenge the status quo
As a Disability Confident Committed Employer we have committed to: ensuring our recruitment process is inclusive and accessible; communicating and promoting vacancies; offering an interview to disabled people who meet the minimum criteria for the job; anticipating and providing reasonable adjustments as required; supporting any existing employee who acquires a disability or long-term health condition, enabling them to stay in work; and carrying out at least one activity that will make a difference for disabled people.
“We are an equal opportunities employer and welcome applications from all suitably qualified persons regardless of their race, sex, disability, religion/belief, sexual orientation or age.” We treat everyone fairly and equitably across the organisation, including providing any additional support and adjustments needed for everyone to thrive.