Cloud Platform Engineer (ML DevOps) developing and managing CI/CD pipelines for ML workflows in a leading insurance company. Collaborating with data scientists and ensuring infrastructure security and compliance.
Responsibilities
Designs, builds, and maintains infrastructure for ML experimentation, model training, and deployment.
Develops and manages CI/CD pipelines for ML workflows (data ingestion, model training, testing, and deployment).
Implements and manages ML platforms (e.g., Azure ML Studio, Fabric, MLflow, Kubeflow, SageMaker, Vertex AI) to support reproducibility and scalability.
Creates tools and environments to automate data versioning, model tracking, and artifact management.
Collaborates with data scientists to enable self-service access to compute resources and production systems.
Monitors, logs, and alerts on ML system health and model performance in production.
Enforces MLOps best practices across teams, including governance, model validation, and rollback strategies.
Ensures infrastructure security, cost-efficiency, and compliance.
Practices daily pair programming and test-driven development when writing software and building products.
Participates in executing the strategy while keeping customer needs and wants in mind.
Requirements
4+ years of experience with software development languages such as Python or Java.
4+ years of experience with cloud technologies such as Azure and AWS.
4+ years of experience with DevOps.
4+ years of experience with Infrastructure as Code technologies such as Terraform, Ansible, Chef, or Puppet.
Exposure to machine learning frameworks and distributed data processing tools like Apache Spark or equivalents.
Senior Site Reliability Engineer improving the reliability of Acuity’s cloud services. Collaborating across teams to define observability standards and incident response in Cork Digital Centre of Excellence.
Azure Senior DevOps Engineer supporting critical cloud systems in the Azure Government Cloud environment. Leading CI/CD pipeline design and implementation with operational best practices.
Automation Engineer enhancing infrastructure and automating operations for client systems. Working in a complex environment oriented towards automation, security, and performance.
Graduate Reliability Engineer at GKN Aerospace enhancing operational excellence through data analysis and project participation within large structural assemblies.
Site Reliability Engineer at WRITER, ensuring 24/7 availability and performance of AI-powered workflows. Collaborating on scalable infrastructure solutions while impacting enterprise customer trust.
Engineer at Trading Technologies improving platform stability through coding and automation. Focus on building advanced monitoring tools for global trading operations.
Senior ML Ops/DevOps developing MLOps platform components at Capco Poland for financial digital transformation. Responsibilities include CI/CD, model deployment, monitoring, and team collaboration.
Senior DevOps Engineer at Verisk, focusing on AWS infrastructure and CI/CD pipeline automation. Ensuring high availability and security through collaboration with development and QA teams.
Senior DevOps & Infrastructure Engineer at IMAGO focusing on automation and infrastructure improvements. Building reliable infrastructure and leading CI/CD optimization in a dynamic environment.