Big Data Engineer optimizing scalable data solutions using Hadoop, PySpark, and Hive at Citi. Responsible for building ETL pipelines and ensuring data quality in a hybrid work environment.
Responsibilities
Design, develop, and maintain efficient and scalable Big Data solutions using PySpark, Apache Hive, and Hadoop ecosystem tools
Implement and optimize ETL processes and data warehousing solutions
Conduct in-depth data analysis and troubleshoot complex data issues
Optimize Big Data workflows, including Spark job tuning and Hive query optimization
Perform rigorous unit testing and validation of data pipelines
Collaborate with data scientists, analysts, and other engineers
Requirements
Extensive experience in designing, developing, and optimizing scalable data solutions using the Hadoop ecosystem
Strong focus on PySpark and Hive
Strong Python knowledge
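Experience implementing and optimizing ETL (Extract, Transform, Load) processes and data warehousing solutions
Ability to conduct in-depth data analysis and troubleshoot complex data issues
Experience optimizing Big Data workflows, including Spark job tuning and Hive query optimization
Habit of rigorous unit testing and validation of data pipelines
Ability to collaborate with data scientists, analysts, and other engineers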
Data Warehouse Modelling Engineer designing and maintaining data models using Data Vault 2.0 for iGaming industry. Collaborating with stakeholders and optimizing data models in a hybrid work environment.
Senior Data Engineer driving impactful data solutions for the climate logistics startup HIVED's core data platform. Collaborating with cross-functional squads to enhance analytics and delivery.
Data Engineer developing and maintaining CRE forecasting infrastructure for Cushman & Wakefield. Collaborates with senior economists and technical teams to ensure high-quality data solutions.
Data Engineer at PwC, engaging with Azure cloud services to enhance data handling and integrity. Responsibilities include pipeline optimizations, documentation, and collaboration with stakeholders.
Data Engineer Manager at PwC focusing on building data infrastructure and solutions. Leading data engineering projects to transform raw data into actionable insights and drive business growth.
Junior Data Engineer at OneMarketData focusing on data quality and integrity in financial datasets. Collaborating with senior analysts and assisting in data management and analysis tasks.
Senior Data Engineering Analyst developing and implementing data solutions. Collaborating in a diverse environment focused on data processing and analysis for clients' digital transformation.
Principal Software Engineer in Threat Data Platform developing AI-driven tools for threat intelligence automation. Collaborating on robust data pipelines for PANW’s product ecosystem.