Lead Data Engineer designing and managing AWS data pipelines and platforms for the AI & Data Engineering team. The role involves collaborating with data scientists, analysts, and stakeholders to deliver data-driven solutions.
Responsibilities
Design and implement scalable ETL/ELT pipelines using AWS Glue, Spark (PySpark), and Step Functions
Work with structured and semi-structured data using Athena, S3, and Lake Formation to enable efficient querying and access control
Develop and deploy serverless data processing solutions using AWS Lambda and integrate them into pipeline orchestration
Perform advanced SQL and PL/SQL development for data transformation, analysis, and performance tuning
Build data lakes and data warehouses using S3, Aurora, and Athena
Implement data governance, security, and access control strategies using AWS tools including Lake Formation, CloudFront, EBS/EFS, and IAM
Develop and maintain metadata, lineage, and data cataloging capabilities
Participate in data modeling exercises for both OLTP and OLAP environments
Work closely with data scientists, analysts, and business stakeholders to understand data requirements and deliver actionable insights
Monitor, debug, and optimize data pipelines for reliability and performance
Requirements
Must have: Python, SQL, PL/SQL, AWS, PostgreSQL, S3, Glue
Good to have: CDK, GitHub
Strong experience with AWS data services: Glue, Athena, Step Functions, Lambda, Lake Formation, S3, EC2, Aurora, EBS/EFS, CloudFront
Proficient in PySpark, Python, SQL (basic and advanced), and PL/SQL
Solid understanding of ETL/ELT processes and data warehousing concepts
Familiarity with modern data platform fundamentals and distributed data processing
Experience in data modeling (conceptual, logical, physical) for analytical and operational use cases
Experience with orchestration and workflow management tools within AWS
Strong debugging and performance tuning skills across the data stack