Architecting AI platform capabilities for Salesforce's AgentForce team. Driving technical vision and system design for deploying large-scale ML models.
Responsibilities
Define the end-to-end architecture for AgentForce’s model serving, inference orchestration, and agentic reasoning loops.
Make high-stakes technical decisions regarding "build vs. buy," model sizing, context window management, and retrieval-augmented generation (RAG) strategies.
Architect scalable pipelines for continuous learning (RLHF/RLAIF) that integrate seamlessly with production traffic without compromising latency or stability.
Design systems for multi-turn agent state management, memory persistence, and tool invocation (function calling).
Own the end-to-end architectural design of AgentForce AI capabilities from product requirements through model design, system implementation, and production rollout.
Translate product use cases (e.g., agent experiences, workflows, UI features) into concrete system architectures, including APIs, service contracts, and model interaction patterns.
Define reference architectures for AI-powered applications (web, backend services, agent runtimes) that standardize how products integrate with AgentForce models.
Translate abstract research concepts into concrete engineering specifications.
Collaborate with scientists to optimize models for deployment (quantization, distillation, pruning) without sacrificing reasoning capabilities.
Mentor Principal Scientists and Staff Engineers on system design principles and architectural patterns.
Requirements
PhD or Master’s in Computer Science, AI, Machine Learning, or Distributed Systems
10+ years of technical experience, with a specific focus on deploying ML models at scale
Proven experience acting as an Architect or Principal-level technical lead for large-scale AI or data platforms
Experience designing and building production-grade AI-powered applications or platforms
Experience defining public/internal APIs, SDKs, and service interfaces for ML/AI capabilities consumed by product teams
Familiarity with frontend–backend–model interaction patterns for low-latency user-facing AI experiences
Deep understanding of Transformer architectures, attention mechanisms, and the math behind LLMs (not just API usage)
Experience with high-performance inference serving (e.g., vLLM, TensorRT-LLM, TGI, Triton) and optimization techniques (quantization, LoRA adapters, paged attention)
Strong background in designing distributed systems, microservices, and event-driven architectures (Kafka, gRPC, Kubernetes)
Advanced proficiency in Python; familiarity with C++ or CUDA is a strong plus