Staff Software Engineer optimizing computational cloud infrastructure for R&D teams at Pfizer. Leading strategy and stakeholder engagement for scientific workload migration and resource management.
Responsibilities
Lead solution design and migration strategy for R&D teams transitioning legacy scientific workloads to cloud-based HPC platforms, ensuring alignment with performance, scalability, security, and cost objectives.
Partner with scientific stakeholders to translate research needs into platform-level infrastructure requirements, providing senior technical guidance on compute, storage, and parallelization approaches.
Serve as the senior technical authority for complex HPC operational issues, defining troubleshooting frameworks, escalation paths, and long-term remediation strategies for scheduler, dependency, and workflow failures.
Own the strategy, quality, and governance of HPC documentation and knowledge assets, ensuring documentation remains accurate, accessible, and aligned with platform standards, onboarding needs, and evolving best practices.
Lead platform-level communications and stakeholder engagement related to HPC operations, including maintenance, capacity changes, and upgrades, ensuring transparency, predictability, and minimal disruption to scientific workloads.
Define and oversee user enablement and training strategy for HPC platforms, ensuring researchers are equipped to use cloud resources efficiently, responsibly, and in accordance with platform best practices.
Own the end-to-end lifecycle strategy for scientific software platforms, including selection, deployment models, upgrade planning, and deprecation, to ensure reliability, reproducibility, and broad usability across research domains.
Establish containerization standards and adoption models for scientific workflows, overseeing the transition of complex applications to container-based execution environments and ensuring consistency across teams and platforms.
Set and govern application performance optimization standards across cloud instance types, guiding workload placement decisions to maximize performance, scalability, and cost efficiency.
Requirements
B.S. with 7+ years or Ph.D. with 3+ years of experience in high-performance computing, cloud computing, and life sciences.
Deep Linux systems expertise supporting the design, standardization, automation, and reliable operation of scientific computing platforms and services.
Excellent written and verbal communication skills with the ability to clearly communicate complex technical concepts to scientific, technical, and platform stakeholders.
Demonstrated ability to lead resolution of complex technical issues while providing clear status updates and communication back to scientific stakeholders.
Deep foundational technical expertise across HPC and cloud platforms, including Linux, Slurm, Kubernetes, GitOps, Google Cloud Platform, Spack, Cluster Toolkit, infrastructure-as-code tooling, GPU architectures, and related scientific software ecosystems.
Advanced experience with at least one of AWS or GCP, including knowledge of core compute and storage services relevant to HPC.
Experience designing, operating, or supporting distributed computing environments, including Kubernetes-based environments such as GKE.
Prior experience with HPC deployment utilities, including Google Cluster Toolkit.
Benefits
401(k) plan with Pfizer Matching Contributions and an additional Pfizer Retirement Savings Contribution
Paid vacation, holiday, and personal days
Paid caregiver/parental and medical leave
Health benefits, including medical, prescription drug, dental, and vision coverage