Principal Software Engineer – AI Inference at NVIDIA | Hybrid Hired

About the role

Principal Software Engineer focusing on AI Inference, optimizing GPU performance and contributing to open-source projects. Collaborative role improving inference systems like vLLM and SGLang at NVIDIA.

Responsibilities

Drive upstream-first engineering in vLLM/SGLang: author and land PRs (or equivalent contributions), engage in development discussions, help shape roadmaps, and build durable maintainer relationships.
Design and implement inference-runtime features that improve efficiency, latency, and tail behavior: request scheduling, batching policies, KV-cache management (paging/sharding), memory planning, and streaming.
Optimize core hot paths across the stack—from Python orchestration down to C++/CUDA kernels—using profiling and measurement to guide decisions.
Improve multi-GPU and multi-node inference: communication patterns, parallelism strategies (tensor/sequence/pipeline), and system-level scaling/efficiency.
Strengthen correctness, robustness, and operability: determinism where needed, graceful degradation, backpressure, observability hooks, and performance regression testing.
Collaborate across NVIDIA to integrate upstream advances with production needs (deployment patterns, compatibility, security posture) while keeping changes broadly adoptable by the community.
Mentor senior engineers, raise the technical bar through design reviews, and establish guidelines for performance engineering and upstream contribution workflows.

Requirements

15+ years building production software with significant depth in systems engineering.
Strong track record of owning ambiguous, high-impact technical problems end-to-end.
Demonstrated expertise in LLM inference/serving systems (e.g., vLLM, SGLang) and the tradeoffs that drive real production performance.
Strong programming skills in Rust, C++, Python, and CUDA; ability to read, modify, and optimize performance-critical code across layers.
Experience with GPU performance analysis tools and methodologies (profiling, microbenchmarking, memory/comms analysis) and a strong measurement culture.
Solid foundation in distributed systems and concurrency: queues/schedulers, RPC/streaming, multi-process/multi-threaded runtime behavior, and scaling patterns across nodes.
Excellent communication skills; ability to influence across teams and represent NVIDIA well in open-source technical forums.
BS/MS in Computer Science, Computer Engineering, or related field (or equivalent experience).

Benefits

equity
benefits
