Data Warehouse Developer responsible for developing and maintaining an open-source Data Lakehouse at Uni Systems. The role involves designing data pipelines and ensuring data quality across multiple sources.
Responsibilities
Development and maintenance of a fully open-source Data Lakehouse.
Design and development of scalable, reliable data pipelines that transform large volumes of both structured and unstructured data.
Data integration from various sources, including databases, APIs, data streaming services and cloud data platforms.
Optimisation of queries and workflows for better performance and efficiency.
Writing modular, testable and production-grade code.
Ensuring data quality through monitoring, validation and automated quality checks, maintaining accuracy and consistency across the data platform (a minimal sketch of such checks follows this list).
Development of test programs.
Comprehensive documentation of processes to support data pipeline management and troubleshooting.
Assistance with deployment and configuration of the system.
Participation in meetings with other project teams.
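As a minimal illustration of the validation work described above, the sketch below shows two basic data quality checks in plain Python; the record layout and the order_id and amount column names are hypothetical, not part of the role description.

    # Minimal data quality checks; records are assumed to arrive as dicts.
    def check_not_null(records, column):
        """Return the rows where `column` is missing or None."""
        return [r for r in records if r.get(column) is None]

    def check_unique(records, column):
        """Return the values of `column` that appear more than once."""
        seen, dupes = set(), set()
        for r in records:
            value = r.get(column)
            if value in seen:
                dupes.add(value)
            seen.add(value)
        return dupes

    rows = [
        {"order_id": 1, "amount": 10.0},
        {"order_id": 1, "amount": None},  # duplicate id, null amount
    ]
    assert check_not_null(rows, "amount")          # flags the null amount
    assert check_unique(rows, "order_id") == {1}   # flags the duplicate id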
Requirements
Bachelor's degree in IT or a related field and 13 years of professional experience in IT.
Excellent knowledge of data warehouse and/or data lakehouse design & architecture.
Excellent knowledge of open-source, code-based data transformation tools such as dbt, Spark and Trino (a short Spark sketch follows this list).
Excellent knowledge of SQL.
Good knowledge of Python.
Good knowledge of open-source orchestration tools such as Airflow, Dagster or Luigi (a minimal Airflow DAG follows this list).
Experience with AI-powered assistants like Amazon Q that can streamline data engineering processes.
Good knowledge of relational database systems.
Good knowledge of event streaming platforms and message brokers like Kafka and RabbitMQ (a Kafka consumer sketch follows this list).
Extensive experience in creating end-to-end data pipelines following the ELT approach.
Understanding of the principles behind open table formats such as Apache Iceberg or Delta Lake (an Iceberg sketch follows this list).
Proficiency with Kubernetes and Docker/Podman.
Good knowledge of data modelling tools.
Good knowledge of online analytical processing (OLAP) and data mining tools.
Fluency in English at C1 level or above.
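The code-based transformation requirement can be illustrated with a short PySpark batch job. This is a sketch only, assuming PySpark is installed; the input path, column names and output path are hypothetical.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_daily").getOrCreate()

    # Read raw order events from the lake (hypothetical path).
    orders = spark.read.parquet("s3a://lake/raw/orders/")

    # Aggregate completed orders into a daily revenue table.
    daily = (
        orders
        .filter(F.col("status") == "completed")
        .groupBy(F.to_date("created_at").alias("order_date"))
        .agg(F.sum("amount").alias("revenue"), F.count("*").alias("orders"))
    )

    daily.write.mode("overwrite").parquet("s3a://lake/curated/orders_daily/")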
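For the orchestration requirement, below is a minimal Airflow DAG sketch, assuming Airflow 2.4+ (for the schedule argument); the DAG id and the extract/load callables are hypothetical placeholders.

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("pull new records from the source API")  # placeholder

    def load():
        print("load the records into the lakehouse")   # placeholder

    with DAG(
        dag_id="orders_elt",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_load = PythonOperator(task_id="load", python_callable=load)
        t_extract >> t_load   # run the load only after the extract succeeds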
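For the event streaming requirement, here is a minimal consumer sketch using the third-party kafka-python client; the topic name and broker address are hypothetical.

    import json
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "orders.events",                    # hypothetical topic
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
        auto_offset_reset="earliest",
    )

    for message in consumer:
        print(message.value)                # each event decoded as a dict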
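Finally, the table-format requirement can be sketched by writing the daily aggregate from the Spark example above into an Apache Iceberg table. This assumes a Spark session already configured with an Iceberg catalog named lake (the catalog, namespace and table names are hypothetical) and reuses the spark and daily variables from that sketch.

    # Create the Iceberg table if it does not exist yet.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS lake.curated.orders_daily (
            order_date date,
            revenue    double,
            orders     bigint
        ) USING iceberg
        PARTITIONED BY (order_date)
    """)

    # Replace only the partitions present in the new data.
    daily.writeTo("lake.curated.orders_daily").overwritePartitions()

Using overwritePartitions keeps reruns idempotent: re-executing a given day rewrites only that day's partition instead of the whole table.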
Benefits
At Uni Systems, we provide equal employment opportunities and prohibit any form of discrimination on grounds of gender, religion, race, colour, nationality, disability, social class, political beliefs, age, marital status, sexual orientation or any other characteristic.
Take a look at our Diversity, Equality & Inclusion Policy for more information.