Site Reliability Engineer at MetroStar providing technical solutions for the federal government. Collaborating with teams on application workloads and operational observability.
Responsibilities
Design, deploy, and maintain mission-critical application workloads on virtualized or containerized environments (e.g., VMware or Kubernetes), ensuring scalability, availability, and compliance with government requirements.
Develop and sustain automated CI/CD pipelines, monitoring, and configuration management workflows to support reliable software delivery and operational observability across development, integration, staging, and production environments.
Provision, configure, and maintain developer environments and toolchains to support rapid, secure, and efficient development workflows, enabling mission-aligned software delivery.
Identify developer friction across the software development lifecycle and implement solutions to reduce that friction and provide developer-first environments.
Establish and maintain a high level of customer trust and confidence through deep technical expertise, and use creativity to provide innovative solutions that fit the customer’s mission needs.
Requirements
Active Top Secret clearance with current or previously held SCI access.
Certification meeting DoD 8140 requirements (e.g., Security+ or higher).
Bachelor’s degree in Computer Science or related engineering field is preferred; relevant experience may substitute.
7+ years of experience in software development, systems engineering, or operations roles with responsibility for availability, performance, and reliability of production systems.
Demonstrated experience blending software engineering and systems administration practices to support highly available, scalable applications.
Experience designing and managing monitoring, alerting, and observability solutions to meet defined Service Level Objectives.
Experience leading or participating in incident response, root cause analysis, and continuous improvement activities.
Experience with Ansible and Desired State Configuration.
Experience with GitLab CI/CD automation and Bash scripting.
Experience supporting container-native storage and object storage solutions (e.g., MinIO, S3-compatible services, and Portworx).
Experience with enterprise load-balancing solutions (e.g., F5 or similar platforms).
Ability to contribute immediately with minimal ramp-up in a mission-critical operational environment.