Manager leading a team of DevOps engineers and shaping cloud infrastructure strategy at a technology company in India.
Responsibilities
Lead, manage, and grow a team of DevOps engineers (FTEs and contractors), overseeing day-to-day delivery, performance reviews, and career development.
Establish clear ownership, accountability, and a high-performance culture within the DevOps function.
Drive training and upskilling initiatives in key areas such as Kubernetes, Terraform, and GCP to keep the team current and effective.
Mentor senior engineers and support their growth into technical leadership roles.
Own and evolve the organization’s cloud infrastructure strategy across AWS and GCP, ensuring platforms are scalable, secure, and cost-effective.
Oversee and architect large-scale migrations and infrastructure modernization programs, including cloud platform transitions and GitHub Enterprise adoption.
Set strategic priorities and roadmaps for reliability, automation, observability, and infrastructure improvements aligned with business objectives.
Collaborate with engineering and product leadership to define infrastructure requirements for new platforms and product initiatives.
Establish and lead a dedicated SRE function within the DevOps team, driving ownership of uptime, incidents, and on-call practices.
Oversee the full incident management lifecycle, including on-call processes, RCA sign-off, corrective actions, and preventive measures, to reduce MTTR.
Define and enforce SLOs, SLIs, and error budgets to maintain high service availability.
Standardize DevOps workflows and tooling across planning, alerting, and incident management platforms.
Define and govern CI/CD standards and pipeline architecture across the organization, ensuring reliable and consistent deployments.
Champion the use of AI-assisted development tools and automation to reduce toil and accelerate delivery velocity.
Oversee container orchestration strategy using Kubernetes (EKS, OpenShift) and ensure best practices for containerized workloads.
Drive Infrastructure as Code (IaC) adoption using Terraform and Ansible to maintain consistent, auditable environments.
Own the organization’s observability strategy, driving adoption of monitoring, logging, and alerting solutions across all platforms.
Lead technology audit and compliance programs aligned to ISO certification standards.
Partner with security teams to embed DevSecOps practices into pipelines and infrastructure provisioning.
Work closely with leadership to communicate risks, trade-offs, and timelines in a clear, actionable manner.
Requirements
10+ years of hands-on experience in DevOps, platform engineering, or site reliability engineering, with at least 2 years in a people management role.
Proven experience managing cross-functional teams including full-time engineers and contractors.
Deep expertise in AWS (20+ services) and hands-on experience with GCP; familiarity with Azure or other cloud platforms is advantageous.
Strong proficiency in container orchestration using Kubernetes (AWS EKS, GKE) and Docker.
Hands-on expertise with Infrastructure as Code tools, particularly Terraform and Ansible.
Demonstrated experience designing and managing CI/CD pipelines using tools such as Jenkins, ArgoCD, GitHub Actions, or GitLab CI.
Experience establishing and running SRE functions, including on-call frameworks, incident management, and RCA processes.
Proficiency in observability tooling, including the Grafana stack (Grafana, Loki, Mimir), ELK/OpenSearch, and AWS CloudWatch.
Strong scripting and automation skills in Python and Shell.
Experience leading or contributing to technology audits and compliance initiatives (e.g., ISO certifications).
Excellent communication skills with the ability to explain technical concepts and risks to non-technical stakeholders and senior leadership.
Experience with project and service management tooling such as Jira, PagerDuty, or equivalent platforms.