Data Engineer with expertise in Databricks, SQL, and Python for scalable data solutions. Focused on ETL/ELT pipeline development, auditability, data quality, and automated testing.
Responsibilities
Lead the design, development, and optimization of ETL/ELT pipelines using Databricks, Python, Spark, and Delta Lake.
Architect scalable data solutions using Medallion architecture (Bronze, Silver, Gold layers).
Design and implement data models and transformations using SQL and Python.
Build and maintain audit frameworks to ensure traceability, compliance, and data lineage.
Develop data quality monitoring and automated testing frameworks for pipeline reliability.
Perform data analysis to support operational data requests and user queries.
Collaborate with clinical data teams to analyse IRT/RTSM datasets.
Create and maintain dashboards and reports using BI tools (e.g., Superset, Power BI, Tableau, Qlik, or similar).
Help manage CI/CD and automated code branching and deployment.
Ensure compliance with GxP, CDISC, and other regulatory standards.
Mentor junior engineers and promote engineering best practices.
Requirements
8–10 years of experience in data engineering, with leadership or team lead responsibilities.
Strong hands-on experience with Databricks, Apache Spark, and Delta Lake.
Advanced proficiency in SQL and Python for data transformation and automation.
Experience with ETL/ELT orchestration, job optimization, and performance tuning.
Proven experience designing and implementing audit, data quality, and testing frameworks.
Hands-on experience with IRT/RTSM clinical trial data systems.
Strong data analysis skills and ability to interpret complex datasets.
Experience with BI/reporting tools such as Power BI, Tableau, or Qlik.
Knowledge of clinical data standards (e.g., CDISC, SDTM, ADaM).
Experience with cloud platforms (Azure, AWS, or GCP) and CI/CD pipelines.