Software Engineer focusing on ML infrastructure for drug discovery at Genesis AI. Leading engineering efforts to enhance scalable platforms for generative modeling and large-scale simulations.
Responsibilities
Lead engineering efforts to continuously improve the AI platform, with an emphasis on rapidly building out and iterating on scalable, robust distributed infrastructure for ML training, inference, and evaluation.
Support model training and deployment across multiple clusters and multiple clouds, optimizing for throughput and cost.
Optimize the efficiency of ML models and other workloads in terms of latency, throughput, memory consumption, etc. (e.g., via GPU performance engineering), pushing the limits of what’s possible with the current hardware.
Contribute to the long-term vision for Genesis’ infra platform.
Requirements
Strong engineer who constantly strives for technical excellence. You can write clean code and have a deep understanding of the codebases you work in.
Deeply experienced with distributed training and inference of large models on GPU clusters, and with some of the core libraries and frameworks we use: PyTorch, PyTorch Lightning, PyTorch Geometric, and Ray.
Independent thinker with a strong sense of ownership and the ability to engineer robust systems, from first-principles conceptualization to state-of-the-art realization.
Curious, problem-oriented thinker who is excited to dive deep into the emerging field at the intersection of AI, physics, chemistry, and biology and make foundational contributions and discoveries (no previous experience in anything but ML necessary).
Experienced with building, maintaining, and debugging low-level cluster infrastructure running on multiple clouds using Kubernetes and Terraform.
Experienced GPU engineer who can quickly identify performance bottlenecks and architect highly performant code for large-scale ML workloads.
Experience with XLA, Triton, CUDA, or similar accelerator programming languages and/or deep learning compiler stacks.
Experience working with some of the following: molecular systems (protein sequences and 3D structures, small molecules, etc.), ML force fields or other physics-informed models and methods, or point cloud data in other application domains, such as 3D graphics.
Benefits
Competitive compensation package that includes salary and equity.
Comprehensive health benefits: Medical, Dental, and Vision (100% covered for employees).
401(k) plan.
Open (unlimited) PTO policy.
Free lunches and dinners at our offices.
Paid family leave (maternity and paternity).
Life and long- and short-term disability insurance.