Site Reliability Engineer (SRE) at Baseten

About the role

As a Site Reliability Engineer at Baseten, you will keep the infrastructure behind AI product deployments scalable for top AI companies. The role involves building automated processes and collaborating across teams.

Responsibilities

Build and maintain scalable infrastructure to support the deployment and operation of machine learning models.
Establish standards and best practices for reliability and performance across the infrastructure.
Automate processes when relevant, particularly for managing CI/CD pipelines.
Own products and projects end-to-end, functioning as both an engineer and a project manager, with a focus on user empathy, project specification, and end-to-end execution.
Collaborate with cross-functional teams to understand project requirements and translate them into technical solutions.
Mentor junior team members and contribute to knowledge sharing within the organization.
Navigate ambiguity and exercise good judgment on tradeoffs and tools needed to solve problems, avoiding unnecessary complexity.
Demonstrate pride, ownership, and accountability for your work, expecting the same from your teammates.

Requirements

Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or a related field.
Extensive experience with Kubernetes.
Experience in building and maintaining scalable infrastructure.
Experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation, Pulumi) and CI/CD tooling (e.g., GitHub Actions, GitLab CI, CircleCI, Jenkins).
Relevant OSS observability experience (Prometheus, ELK stack, Grafana stack, OpenTelemetry) is a plus.
Ability to own projects end-to-end, from project specification to execution.
No prior machine learning experience required, but should be open to learning about it.

Benefits

Competitive compensation, including meaningful equity.
100% coverage of medical, dental, and vision insurance for employees and dependents.
Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
Paid parental leave.
Company-facilitated 401(k).
Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
