AI Inference Engineer developing AI model optimizations for Quadric's GPNPU platforms. Porting and benchmarking AI models to enhance performance in edge devices.
Responsibilities
Quantize, prune and convert models for deployment
Port models to Quadric platform using Quadric toolchain
Optimize inference deployment for latency and speed
Benchmark and profile model performance and accuracy
Develop tools to scale and speed up deployment
Improve the SDK and runtime
Provide technical support and documentation to customers and the developer community
Requirements
Bachelor’s or Master’s in Computer Science and/or Electrical Engineering
5+ years of experience in AI/LLM model inference and deployment frameworks/tools
Experience with model quantization (PTQ, QAT) and related tools
Experience with model accuracy metrics
Experience with model inference performance profiling
Experience with at least one of the following frameworks: ONNX Runtime, PyTorch, vLLM, Hugging Face Transformers, Neural Compressor, llama.cpp
Proficiency in C/C++ and Python
Strong problem-solving, debugging, and communication skills
AI Partner Technical Engineer guiding partners in Dell’s AI Technical Partner Program. Ensuring seamless technical journeys and optimizing AI solutions' compliance with Dell’s criteria.
Associate Director leading medical engagement transformation for Johnson & Johnson Oncology portfolio. Designing and delivering analytics solutions for enhanced customer engagement in healthcare.
AI Change Enablement Consultant enabling clients through design-led thinking and practical enablement. Role involves collaboration with architects and stakeholders to drive measurable outcomes.
Incident & Problem Manager responsible for critical incident resolution and AI evolution in incident management at Omnissa. Leading operational excellence in a global IT operations organization.
Intern in AI Innovation at Founders Bay Accelerator, supporting startups with AI applications and workflows. Collaborate closely with the founding team and engage in hands-on AI projects.
Design and deploy intelligent automation and agentic AI solutions for clients in India. Collaborate on AI strategies and evangelize best practices in the fusion of AI and business.
Digital Transformation & AI Specialist designing digital solutions for a global flower industry company. Enhancing operational efficiency and supporting technological evolution through AI and automation processes.
Strategic Growth Partner shaping and scaling Data & AI footprint in Financial Services. Joining Keyrus, a global consultancy, you'll advise on data transformation agendas.
AI/ML Research and Development Intern at the Applied Research Laboratory, focusing on machine learning and AI technologies. Collaborating with teams on algorithmic solutions and software development.
AI Automation Analyst at Antares responsible for AI initiatives and PMO support. Collaborating on intelligent workflows and automation projects within technology operations.