Forward Deployed SRE at Baseten

About the role

Primary post-sales technical owner ensuring the reliability of ML workloads for strategic customers at an AI company. Collaborates with teams to drive technical success and product improvements.

Responsibilities

Diagnose and resolve runtime issues related to latency, memory behavior, GPU utilization, concurrency, and model lifecycle management.
Debug infrastructure issues across Kubernetes (pods, controllers), networking, observability, and alerting systems.
Lead incident response during outages or escalations, managing coordination between Product, FDE, Sales, and Engineering.
Serve as the technical owner for top enterprise accounts with strict SLAs and high responsiveness expectations.
Identify common failure modes and translate user feedback into roadmap signals, product improvements, internal runbooks, knowledge bases, and diagnostic best practices.
Own project coordination end-to-end: scoping, execution, communication, and stakeholder alignment across technical and non-technical teams, for work ranging from feature requests and new deployments to operational debugging issues.

Requirements

Deep Kubernetes troubleshooting expertise, including advanced resource debugging, pod/runtime analysis, and log-based diagnostics using observability tooling such as Grafana, Loki, and Prometheus.
Strong infrastructure debugging ability across container orchestration, networking, and service dependencies, with hands-on experience supporting production-grade clusters.
Experience managing high-severity incidents with major customers, including SLAs, post-incident reviews, and clear communication throughout escalations.
Proven project management and organizational skills with an ownership mindset, able to manage multiple complex, multi-stakeholder initiatives in parallel, including issue resolution, root-cause analysis, and feature delivery.
Ability to translate recurring technical pain points into roadmap-level insights, documentation improvements, or product enhancements.
Strong communication skills and executive presence during high-visibility situations, ensuring technical clarity and customer confidence.
3+ years of experience in a fast-paced, high-growth, or customer-facing engineering environment.

Benefits

Competitive compensation, including meaningful equity.
100% coverage of medical, dental, and vision insurance for employees and dependents.
Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
Paid parental leave.
Company-facilitated 401(k).
Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
