DevOps Engineer optimizing CI/CD and security infrastructure at Helpshift. Ensuring reliability and scalability while mentoring junior team members in a hybrid workplace.
Responsibilities
Design, implement, and maintain secure CI/CD pipelines for automating deployment, configuration, and testing processes.
Own Helpshift production services: ensure complete monitoring coverage, and troubleshoot and fix production issues.
Build a seamless zero-downtime process to upgrade our core infrastructure (ScyllaDB, Elasticsearch, Kafka, MongoDB, Redis).
Move us to another region with no downtime.
Build a cloud infrastructure that is easy to move to a different cloud service provider.
Collaborate with development and operations teams to integrate security practices into the software development lifecycle.
Conduct regular security assessments, vulnerability scans, and penetration testing to identify and mitigate security risks.
Develop and maintain infrastructure as code (IaC) templates for provisioning and configuring cloud resources securely.
Monitor and respond to production incidents, including investigation, containment, and remediation activities.
Stay up-to-date with the latest security threats, vulnerabilities, and best practices, and make recommendations for continuous improvement.
You will play a pivotal role in ensuring the security, scalability, and reliability of our infrastructure and applications.
You will collaborate closely with cross-functional teams to implement security best practices throughout the development lifecycle, automate security processes, and enhance our overall DevSecOps capabilities.
Mentor junior team members.
Requirements
4+ years of relevant experience.
In-depth knowledge of running/managing UNIX-like operating systems (we use Ubuntu).
Strong knowledge of networking protocols, security architectures, and identity and access management (IAM) principles.
Experience with containerisation technologies (e.g., Docker, Kubernetes) and securing containerised environments.
Experience designing and building solutions that are highly scalable, fault-tolerant, and cost-effective.
Experience with various FOSS tools for monitoring, graphing, capacity planning, and logging.
Experience with IaC tools like Ansible, Puppet, and Terraform.
Experience with cloud computing platforms like Amazon Web Services (AWS), Google Cloud Platform, and Heroku.
Experience managing NoSQL and relational databases.
Experience with queuing systems (Kafka, RabbitMQ) and big data platforms (Hadoop).
Good programming skills with a focus on scripting (Python, Shell, Perl).
Ability to analyse architectural bottlenecks and quickly debug issues through to resolution.
An automation mindset and the ability to reason about and work with complex systems.
Excellent communication and documentation skills.
Quick learner and good mentor for junior team members.