DevOps Engineer developing and enhancing machine learning infrastructure. Collaborating with AI teams to support ML projects at an enterprise SaaS startup for contact centers.
Responsibilities
Design, build, and enhance core components of state-of-the-art machine learning infrastructure (cloud and on-premise), and architect platforms to create, train, and deploy ML models.
Build operational dashboards and charts to track system errors and performance and to enable root cause analysis.
Identify gaps and evaluate relevant tools and technologies as needed to improve processes and systems, leveraging open-source and cloud computing technologies to build effective solutions.
Collaborate with the AI team to drive ML projects from conception to completion and production monitoring.
Requirements
Bachelor's degree or above with a good academic background.
2-4 years of meaningful work experience in DevOps handling complex services.
Strong troubleshooting skills to keep our services highly available.
Strong expertise and experience with Google Cloud Platform (GCP), Docker, Kubernetes, CI/CD, and Jenkins.
Extensive experience in designing, implementing, and maintaining infrastructure as code, preferably using Terraform.
Experience creating and maintaining deployment manifest files for microservices using Helm.
Having LLMOps or MLOps experience is a bonus.
Strong expertise in deploying at scale on Kubernetes clusters using the Horizontal Pod Autoscaler (HPA).
Broad technical background and experience with the architecture, design, and operation of cloud solutions, including how to meet security compliance requirements.
Experience monitoring system health and ensuring security, scalability, and reliability.
Experience designing, implementing, and maintaining observability, monitoring, logging, and alerting using tools like Prometheus, Grafana, Promtail, Loki, and Datadog.
Benefits
Market-leading compensation, based on the skills and aptitude of the candidate.