Senior Software Engineer focusing on complex infrastructure deployment for AION's AI cloud platform. Managing multi-cloud deployment strategies and ensuring compliance across customer environments.
Responsibilities
Design AION as a composable platform with independently deployable components that run seamlessly on AWS, GCP, Azure, and private data centers
Work with senior engineering leads to define private deployment strategies and build automation for customer VPC and on-premises installations
Build abstraction layers that unify diverse cloud providers while maintaining flexibility for customer-specific requirements
Design globally distributed deployment patterns with built-in support for data sovereignty, compliance, and regulatory requirements
Own end-to-end platform deployment automation using Terraform, Ansible, Helm, and other infrastructure-as-code tooling across hybrid cloud environments
Design and implement disaster recovery, failover, high-availability architectures, and cloud migration strategies for customer deployments
Build comprehensive CI/CD pipelines for infrastructure provisioning, configuration management, and deployment orchestration
Implement monitoring, observability (Prometheus, Grafana, Loki), and alerting systems tailored for customer-managed AION instances
Implement Kubernetes-based and custom orchestrator-based managed services with strict workload isolation and multi-tenancy
Design container security, runtime protection, network policies, and secrets management for production workloads
Own compliance implementation (SOC2, GDPR, HIPAA, ISO 27001, PCI-DSS) and security best practices for customer environments
Create deployment blueprints, reference architectures, self-service portals, and comprehensive documentation for customer success
Requirements
6+ years of experience in platform deployment, DevOps, SRE, or cloud infrastructure roles with focus on customer-facing deployments
Deep expertise in Kubernetes including cluster design, multi-tenancy, custom resources, operators / controllers, and production operations
Solid understanding of Linux processes and container internals, particularly runtime optimizations such as lazy image loading (Nydus, eStargz) and checkpoint/restore mechanisms (CRIU) for fast migration and reduced cold-start times
Deep understanding of computer networking and the OSI model, with experience in creating overlay networks using VXLAN or BGP and implementing network isolation through CIDRs
Strong understanding of hybrid and multi-cloud architectures combining on-premises, private, and public cloud resources, including VPCs, routing, network policies, and VPN tools like WireGuard
Proficiency in infrastructure-as-code using Terraform, Ansible, Pulumi, Nix, or CloudFormation across multiple cloud providers
Experience building and maintaining GitOps pipelines for infrastructure and application deployments using GitLab CI, GitHub Actions, ArgoCD, or FluxCD
Knowledge of secrets management (External Secrets Operator, Vault, AWS Secrets Manager, GCP Secret Manager) and encryption at rest/in transit
Knowledge of the observability stack, including Prometheus, Grafana, Loki, distributed tracing (Jaeger, Tempo), and log aggregation
Programming/scripting skills in Go or Python for building automation tools, operators, and deployment scripts
Hands-on experience deploying complex platforms in customer VPCs and on-premises environments with strict isolation requirements
Experience designing and executing cloud migration strategies including lift-and-shift, re-platforming, and cloud-native transformations
Strong knowledge of security compliance frameworks (SOC2, GDPR, HIPAA, ISO 27001, PCI-DSS) and their implementation in cloud infrastructure
Familiarity with disaster recovery strategies, backup solutions (Velero, Kasten), and business continuity planning
Exposure to HPC systems, GPU orchestration, and AI workload patterns is highly desirable
Benefits
**Preferred Attributes:**
High ownership, self-driven, with a bias for action.
Strong strategic thinking and ability to connect technical decisions to business impact.
Excellent communication and mentoring skills.
Thrives in ambiguity, fast-paced environments, and early-stage startup culture.
**Why Join AION?**
Work directly with high-pedigree founders shaping technical and product strategy.
Build infrastructure powering the future of AI compute globally.
Significant ownership and impact with equity reflective of your contributions.
Competitive compensation, flexible work options, and wellness benefits.
**Apply Now:**
If you’re a strong engineer ready to lead architecture and scale next-generation AI infrastructure, we want to hear from you. Please share:
Your resume, highlighting relevant projects and leadership experience.