Foundational AI Research Scientist developing next-generation language models. Pioneering large-language-model architectures and attention mechanisms for efficient scaling.
Responsibilities
Research and prototype sub-quadratic attention architectures to unlock efficient scaling of large language models.
Design and evaluate efficient attention mechanisms including state-space models (e.g., Mamba), linear attention variants, and sparse attention patterns.
Lead pre-training initiatives across model scales from 1B to 100B+ parameters.
Conduct rigorous experiments measuring the efficiency, performance, and scaling characteristics of novel architectures.
Collaborate closely with product and engineering teams to integrate models into production systems.
Stay at the forefront of foundational research and help shape Aldea's long-term model roadmap.
Requirements
Ph.D. in Computer Science, Engineering, or a related field.
3+ years of relevant industry experience.
Deep understanding of modern sequence modeling architectures including State Space Models (SSMs), Sparse Attention mechanisms, Mixture of Experts (MoE), and Linear Attention variants.
Hands-on experience pre-training large language models across a range of scales (1B+ parameters).
Expertise in PyTorch, Transformers, and large-scale deep-learning frameworks.
Proven ability to design and evaluate complex research experiments.
Demonstrated research impact through patents, deployed systems, or core-model contributions.
Nice to Have
Experience with distributed training frameworks and multi-node optimization.
Knowledge of GPU acceleration, CUDA kernels, or Triton optimization.
Publication record in top-tier ML venues (NeurIPS, ICML, ICLR) focused on architecture research.
Experience with model scaling laws and efficiency-performance tradeoffs.
Background in hybrid architectures combining attention with alternative sequence modeling approaches.
Familiarity with training stability techniques for large-scale pre-training runs.
Benefits
Competitive base salary
Performance-based bonus aligned with research and model milestones