Staff Data Engineer on the Real World Evidence team driving large-scale data initiatives. Collaborating with cross-functional teams to optimize data pipelines and improve healthcare outcomes.
Responsibilities
Act as a self-starter: take ownership and initiative, and drive execution with minimal day-to-day direction.
Lead high-visibility RWE projects, starting with claims data, and keep multiple initiatives moving by proactively unblocking teams.
Own the end-to-end architecture for critical data assets, ensuring solutions are scalable, reliable, and aligned with H1’s long-term vision.
Design, build, and optimize large-scale data pipelines (hundreds of TBs) for performance, reliability, and cost efficiency.
Partner with Product, Data Science, and downstream engineering teams to align priorities, manage dependencies, and deliver high-value outcomes.
Represent engineering in cross-functional forums, shaping roadmaps and reducing reliance on senior leadership for day-to-day decisions.
Develop deep domain expertise and mentor other engineers, helping raise the technical bar and influence the evolution of our data products.
Requirements
8+ years as a software, data, or backend engineer building and operating scalable, production-grade systems.
Experience with large-scale data processing (e.g., Spark/PySpark on EMR or similar) or scalable distributed backend systems, with the ability to quickly deepen expertise in our data stack (PySpark, EMR, Hudi/Delta).
Strong proficiency in SQL, including writing and optimizing complex queries over large datasets.
Strong programming experience in Python (or a modern language with the ability to quickly ramp up in Python).
Experience designing systems or large-scale datasets/pipelines with attention to performance, reliability, and maintainability.
Hands-on experience with modern engineering workflows and tooling such as Git, JIRA, and CI/CD systems (e.g., CircleCI).
Comfort deploying and troubleshooting distributed workloads in cloud environments such as AWS EMR or Kubernetes.
Experience with workflow orchestration or job scheduling tools (e.g., Airflow, Argo).
Demonstrated ability to independently drive complex, cross-team technical initiatives and influence stakeholders without formal authority.
Experience with streaming/messaging technologies (e.g., Kafka, Kinesis) is a nice-to-have.
A background in RWE, healthcare data, or other complex or regulated data domains is preferred.
Experience using AI-assisted coding tools (e.g., GitHub Copilot, Claude Code) to accelerate development while maintaining quality is a plus.
Benefits
Full suite of health insurance options, in addition to generous paid time off
Pre-planned company-wide wellness holidays
Retirement options
Health & charitable donation stipends
Impactful Business Resource Groups
Flexible work hours & the opportunity to work from anywhere
Data Engineering Intern assisting with data projects and cloud solutions at Simmons Bank. Collaborating on data pipelines and gaining exposure to modern data engineering concepts.
Data Engineer building and scaling a client-facing Microsoft Fabric analytics platform to drive revenue and decision-making. Collaborating with teams to develop pipelines, optimize performance, and ensure client satisfaction.
Data Engineer role focusing on migrating legacy systems to ADA at BBVA. Collaborate with multidisciplinary teams and ensure system integrity during transitions.
Senior Data Engineer focused on modernizing enterprise data capabilities at U.S. Bank. Designing and building reusable data engineering patterns for consistent delivery across teams.
Experienced Data Architect designing and implementing scalable data architecture for a financial services and healthcare technology company. Collaborating across teams to support analytics and operational needs.
Principal Data Pipeline Lead at SS&C overseeing development of scalable data pipelines. Leading a small team and providing technical guidance for modern data platform integration.
Senior Data Engineer at SS&C building and optimizing data pipelines in a lakehouse environment. Collaborating with data architects and stakeholders in the financial services sector.
Data Architect designing scalable, secure data architectures for fraud detection and risk management at Fiserv. Collaborating with cross-functional teams and managing large datasets and pipelines.
Director of Engineering overseeing development of AI-driven data platforms at LVT. Leading teams to transform sensor data into actionable insights using modern architecture and technologies.
Senior Data Engineer at Independence Pet Holdings shaping data ecosystem by building platforms and pipelines. Collaborating with teams to enhance data analytics and operational insights.