Site Reliability Engineer working on cloudification of backup services at Expleo. Contributing to infrastructure evolution with a team of skilled engineers.
Responsibilities
Join our dynamic Backup Services team as a Site Reliability Engineer working on the exciting cloudification of our backup platform using Commvault technology.
You'll be part of a growing team of 5-10 highly skilled engineers within the Cloud Services division, contributing to the evolution of our critical backup infrastructure.
Design and implement reliable, scalable backup infrastructure solutions using Commvault in cloud environments
Lead the deployment project for the new backup platform
Monitor, troubleshoot, and optimize backup platform performance and availability
Collaborate with cross-functional teams to ensure backup service reliability and disaster recovery capabilities
Implement automation and best practices to enhance operational efficiency and system resilience
Participate in on-call rotations and incident response to maintain 24/7 service availability
Requirements
Bachelor's degree or equivalent experience
Minimum 3 to 5 years of Linux experience in a high-availability (HA) environment
Self-starter, autonomous
Comfortable with software development and programming
Ability to engage with both technical and non-technical staff at all levels in the organization
Quick learner
Ability to make your way through a complex automation stack autonomously
Automation is your answer to (almost) everything
Open source enthusiast
Believe in Infrastructure as Code
Aim for zero-support solutions and self-healing systems
Security minded
Ability to balance trade-offs: stability vs. agility, operational work vs. software engineering, proactive vs. reactive work
Technical knowledge of Commvault or another major backup solution
Excellent knowledge of Red Hat / CentOS Linux
Experience with version control (Git, GitLab, Bitbucket), Git workflows, and CI/CD
Experience with Puppet
Experience with Terraform
Comfortable with scripting (Bash, Python, Ruby, Puppet, CI/CD)
Understanding of network protocols (IP, DHCP, DNS, BGP, load balancing)
Experience with programming languages (Java, Golang, Python)
Experience with multi-DC setup in different countries
Experience with hybrid infrastructure (on premises + cloud)
Experience with security standards (PCI-DSS)
Excellent communication skills in English
Proficiency in at least one programming language (Python, Go, Ruby, Perl)
Excellent problem-solving and analytical skills
Strong communication and collaboration skills
Benefits
Holiday Voucher
Private medical insurance
Performance bonus
Easter and Christmas bonus
Employee referral bonus
Bookster subscription
7card
Work from home options depending on project
Job title
Senior Site Reliability Engineer – Backup Services