Manager of Machine Learning Operations at Zefr, leading a team to build scalable ML infrastructure and optimize ML model performance. Collaborate closely with ML Engineers and Data Scientists for robust pipeline management.
Responsibilities
Lead, mentor, and grow a team of Machine Learning Engineers, fostering a culture of innovation and continuous improvement
Design and implement scalable ML infrastructure for model training, deployment, and serving
Establish and enforce best practices for ML model lifecycle management, including versioning, testing, and monitoring
Develop and maintain CI/CD pipelines for machine learning workflows
Optimize model inference performance and reduce latency/cost across production systems
Collaborate with ML Engineers and Data Scientists to productionize models efficiently
Implement robust monitoring, alerting, and observability solutions for ML systems
Drive technical decisions on MLOps tooling, infrastructure, and architecture
Ensure high availability and reliability of ML services at scale
Manage project timelines, priorities, and resource allocation for the MLOps team
Requirements
Bachelor's or Master's degree in Computer Science or related field with 5+ years of professional experience in ML Engineering or MLOps
2+ years of experience managing or leading engineering teams
Deep expertise in ML model deployment, serving infrastructure, and production ML systems
Hands-on experience with transformer architectures (e.g., BERT, ViT) for natural language and vision tasks
Strong understanding of multimodal embedding techniques for integrating text, image, audio, and structured data
Experience with LLMs such as Gemini, GPT, Claude, and Qwen
Experience with ML experiment tracking, model versioning, and feature stores
Strong understanding of CI/CD principles applied to ML workflows
Experience optimizing model inference performance (ONNX, TensorRT, or similar)
Excellent leadership, communication, and stakeholder management skills
Track record of building and scaling high-performing engineering teams
Openness to new technologies and creative solutions
Benefits
Flexible PTO
Medical, dental, and vision insurance with FSA options
Company-paid life insurance
Paid parental leave
401(k) with company match
Professional development opportunities
14 paid holidays
Flexible hybrid work schedule
"Summer Fridays" (shorter work days on select Fridays during the summertime)
In-office lunches and lots of free food
Optional in-person and virtual events (we like to celebrate!)
Principal Machine Learning Engineer driving technical direction for core Snap products. Set vision and develop ML technology for millions of Snapchatters.
Senior Machine Learning Engineer designing and developing generative AI systems for Adobe Firefly Services. Collaborating with other engineering teams on innovative solutions and optimizing performance in large-scale environments.
Senior Machine Learning Engineer developing AI solutions for IDEXX's Data & AI Center of Excellence. Designing and implementing machine learning models and systems while collaborating with cross-functional teams.
Product-minded Machine Learning Engineer at Arcade crafting generative models and solutions. Collaborating across design, engineering, and product to create innovative content features.
Senior Software Engineer innovating with generative AI solutions for cutting-edge network management. Oversee software development lifecycle and collaborate with cross-functional teams.
AI/ML Engineer applying AI/ML techniques in hardware manufacturing for yield prediction and process improvement. Collaborating on research and deployment of machine learning models.
Senior Machine Learning Engineer developing advanced ML and NLP solutions for Forrester’s conversational AI chatbot. Collaborating with cross-functional teams to deliver scalable, production-ready ML systems.
Machine Learning Engineer designing GPU computing kernels to optimize 3D GenAI models at Meshy. Collaborating with researchers to enhance performance and efficiency in GPU module development.
Senior Software Engineer developing scalable machine learning solutions for a product-driven team at Maropost. Collaborating on recommendation systems and enhancing developer experience within the Machine Learning team.
Principal MLOps Engineer leading design and optimization of machine learning infrastructure at Wood Mackenzie. Collaborating with data science and engineering teams to ensure robust automated ML lifecycles.