Site Reliability Engineer – Data Platform at Veepee

About the role

As a Site Reliability Engineer, you will ensure the reliability and performance of Veepee's data platform services and collaborate on cloud migration, Kubernetes operations, and observability best practices.

Responsibilities

Ensure the reliability and performance of our data platform services (Trino, Iceberg, S3, Kafka, Flink)
Define and implement SRE best practices: SLIs/SLOs, error budgets, and observability
Build and maintain monitoring, alerting, and incident response frameworks (Prometheus, Grafana, etc.)
Contribute to the migration from a public cloud data warehouse to VeepeeCloud’s lakehouse stack
Support coexistence between cloud and on-prem systems and ensure data consistency and service reliability
Help design resilient architectures for ingestion, transformation, and serving layers
Operate and improve services running on Kubernetes (GKE/EKS and on-prem clusters)
Automate infrastructure provisioning using Terraform, Atlantis, and/or Crossplane
Improve GitOps workflows for platform deployment and configuration
Collaborate with teams to optimize compute and storage usage (Trino queries, BigQuery slots, etc.)
Build tools and dashboards to track cost, usage, and efficiency
Support the transition toward cost-efficient on-prem workloads
Improve self-service capabilities for data teams (e.g., provisioning Trino/Iceberg resources)
Help teams adopt best practices in reliability, observability, and deployment
Write clear technical documentation and runbooks
Contribute to the definition and implementation of the Disaster Recovery Plan (DRP)
Ensure multi-DC resilience (FR1 / NL1) and implement data replication strategies
Participate in incident management and postmortems

Requirements

Strong experience with Kubernetes in production environments
Experience with distributed data systems (or a strong willingness to learn)
Solid understanding of SRE principles (monitoring, alerting, SLAs/SLOs)
Experience with Infrastructure as Code (Terraform or similar tools)
Familiarity with GitOps workflows
Experience with observability tools (Prometheus, Grafana, logging systems)
Comfortable working in cloud environments
Strong collaboration mindset and the ability to work across teams
Fluent in English

Benefits

Variable bonus
Dynamic and creative environment within international teams
Access to a variety of self-learning courses on our e-learning platform
Opportunity to participate in local and international meetups and conferences
Flexible office policy with up to 3 days of remote work per week
