AI Infrastructure Engineer designing and implementing AI solutions for Xsolla's infrastructure tasks across GCP and multi-cloud environments. Collaborating with senior engineers to execute AI strategy.
Responsibilities
Design and implement AI/ML-powered solutions for infrastructure use cases, including predictive autoscaling, anomaly detection, intelligent cost optimization, and automated remediation across GCP and multi-cloud environments
Build and maintain AI-driven monitoring and observability systems that correlate logs, metrics, and traces to surface root causes, predict bottlenecks, and reduce mean time to resolution (MTTR)
Develop and operate automated incident response workflows using AI-powered playbooks that diagnose, contain, and resolve infrastructure issues with minimal manual intervention
Integrate AI tooling into CI/CD pipelines to improve deployment reliability, automate test prediction, score release health, and support rollback automation
Contribute to the development of internal AI agents and virtual assistants integrated into developer workflows (Slack, IDEs, Confluence), enabling self-service for provisioning, troubleshooting, and infrastructure guidance
Implement AI/ML-based anomaly detection and automated vulnerability management workflows to enhance the security posture of Xsolla's infrastructure
Prototype and productionize Generative AI solutions for infrastructure automation, including auto-generation of Terraform/Puppet modules, IaC configurations, runbooks, and change documentation
Collaborate with senior engineers and leadership to evolve and execute the infrastructure AI strategy across its implementation phases
Maintain clear documentation of AI tools, integrations, and automated workflows; share knowledge and best practices across the team
Requirements
5–7 years of experience in infrastructure engineering, DevOps, SRE, or a related field
Hands-on experience with GCP (priority) and/or AWS; solid understanding of cloud resource management, scaling, and cost structures
Practical experience building or integrating AI/ML-powered tools in an operational context (anomaly detection, predictive models, LLM-based automation, or similar)
Experience with infrastructure-as-code tools (Terraform, Puppet, Ansible, or equivalent)
Proficiency in Python for scripting, automation, and AI/ML integration; Bash or Go a plus
Working knowledge of Kubernetes and container orchestration in production environments
Familiarity with observability and monitoring stacks (Prometheus, Grafana, ELK, Datadog, or similar)
Familiarity with LLM APIs (OpenAI, Anthropic, or similar) and prompt engineering for operational use cases
Strong problem-solving mindset with a bias toward automation and eliminating toil
Infrastructure Systems Engineer II managing production application support for Conduent. Collaborating on ITIL processes and incident management while working in a 24/7 environment.
OT Cybersecurity Specialist responsible for secure IT-OT infrastructures in industrial operations. Engaging in secure deployments, integrating cybersecurity frameworks, and providing expert support.
Infrastructure and Security Engineer collaborating on the design of secure architectures at CRG Solutions. Integrating cybersecurity best practices and managing incidents in Windows and Linux environments.
Senior Infrastructure Engineer managing global IT infrastructure for aviation solutions, focusing on VMware, Nutanix, and Windows Server environments. Collaborating with teams to ensure high availability and optimal performance in a hybrid work model.
Cloud Support Engineer maintaining operational stability and automation for Azure cloud platforms. Working collaboratively across IT teams to ensure infrastructure reliability and security.
Database Engineer at Aircall building tooling for database management and observability. Working in a fast-paced environment for an innovative customer communications platform.
Lead Cloud Infrastructure Engineer at Paramount managing cloud architecture and infrastructure initiatives across environments. Involved in automation, scalability, and mentoring infrastructure engineers.
Senior Infrastructure Engineer specializing in Cisco and VMware to modernize hybrid environments for strategic partners. Ownership and mentorship role within a collaborative IT team.
Data Cloud & Infrastructure Architect connecting BigQuery potential with Salesforce execution. Mastering identity resolution and driving real-time data orchestration in a hybrid environment.
Infrastructure Engineer developing infrastructure technology for public and private cloud environments. Complying with security and operational requirements, while using automation to enhance product testing.