Senior Data Engineer designing scalable ETL data pipelines using Databricks for a software consulting company. Collaborating with teams to implement robust data solutions in diverse business environments.
Responsibilities
Design and implementation of robust, scalable, and high-performance ETL/ELT data pipelines using PySpark/Scala and Databricks SQL on the Databricks platform;
Implementation and optimization of the Medallion Architecture (Bronze, Silver, Gold) using Delta Lake, ensuring data quality, consistency, and historical tracking;
Efficient implementation of the Lakehouse architecture on Databricks, combining best practices from traditional Data Warehousing and Data Lake paradigms;
Optimization of Databricks clusters, Spark operations, and Delta tables (e.g. Z-Ordering, compaction, query tuning) to reduce latency and compute costs;
Design and implementation of real-time and near-real-time data processing solutions using Spark Structured Streaming and Delta Live Tables (DLT);
Implementation and administration of Unity Catalog for centralized data governance, fine-grained security (row- and column-level security), and end-to-end data lineage;
Definition and implementation of data quality standards and validation rules (e.g. using DLT or Great Expectations) to ensure data integrity and reliability;
Development and management of complex workflows using Databricks Workflows (Jobs) or external orchestration tools such as Azure Data Factory or Airflow to automate data pipelines;
Integration of Databricks pipelines into CI/CD processes using Git, Databricks Repos, and Databricks Bundles;
Close collaboration with Data Scientists, Analysts, and Architects to translate business requirements into optimal technical solutions;
Providing technical mentorship to junior engineers and promoting engineering best practices across the team.
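The Medallion flow described above can be sketched in Databricks SQL. This is a minimal illustration under assumed table names (`bronze_orders`, `silver_orders` are hypothetical), not a reference implementation:

```sql
-- Bronze: raw, append-only ingestion into Delta (values kept as-is)
CREATE TABLE IF NOT EXISTS bronze_orders (
  order_id     STRING,
  amount       STRING,
  _ingested_at TIMESTAMP
) USING DELTA;

-- Silver: typed, cleaned, and deduplicated target table
CREATE TABLE IF NOT EXISTS silver_orders (
  order_id     STRING,
  amount       DECIMAL(18, 2),
  _ingested_at TIMESTAMP
) USING DELTA;

-- Idempotent upsert from Bronze to Silver with a basic quality gate
MERGE INTO silver_orders AS s
USING (
  SELECT order_id,
         CAST(amount AS DECIMAL(18, 2)) AS amount,
         _ingested_at
  FROM bronze_orders
  WHERE order_id IS NOT NULL
) AS b
ON s.order_id = b.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```

A Gold layer would then typically aggregate `silver_orders` into business-level tables or views for analytics consumers.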
Requirements
Proven, expert-level experience across the full Databricks ecosystem, including Workspace management, cluster configuration, notebooks, and Databricks SQL;
In-depth knowledge of Spark architecture (RDDs, DataFrames, Spark SQL) and advanced performance optimization techniques;
Strong expertise in implementing and managing Delta Lake features, including ACID transactions, Time Travel, MERGE operations, OPTIMIZE, and VACUUM;
Advanced/expert proficiency in Python (PySpark) and/or Scala (Spark);
Expert-level SQL skills and strong experience with data modeling approaches (Dimensional Modeling, 3NF, Data Vault);
Solid hands-on experience with a major cloud platform (AWS, Azure, or GCP), with a strong focus on cloud storage services (S3, ADLS Gen2, GCS) and networking fundamentals.
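The Delta Lake features listed above (Time Travel, MERGE, OPTIMIZE, VACUUM) look like this in day-to-day use; a short sketch against a hypothetical `silver_orders` table:

```sql
-- Compact small files and co-locate data for selective queries
OPTIMIZE silver_orders ZORDER BY (order_id);

-- Time Travel: query an earlier version of the table by version number
SELECT * FROM silver_orders VERSION AS OF 12;

-- Remove data files no longer referenced by the table
-- (subject to the default 7-day retention threshold)
VACUUM silver_orders;
```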
Nice to have
Practical experience implementing and administering Unity Catalog for centralized governance and fine-grained access control;
Hands-on experience with Delta Live Tables (DLT) and Databricks Workflows for building and orchestrating data pipelines;
Basic understanding of MLOps concepts and hands-on experience with MLflow to support collaboration with Data Science teams;
Experience with Terraform or equivalent Infrastructure as Code (IaC) tools;
Databricks certifications (e.g. Databricks Certified Data Engineer Professional) are considered a significant advantage;
Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related technical field;
5+ years of experience in Data Engineering, including 3+ years working with Databricks and Apache Spark at scale.
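As an illustration of the Delta Live Tables experience mentioned above, a streaming table with a declarative quality expectation might be sketched as follows (table and constraint names are hypothetical):

```sql
-- Rows failing the expectation are dropped and counted
-- in the pipeline's data-quality metrics
CREATE OR REFRESH STREAMING TABLE silver_orders (
  CONSTRAINT valid_order EXPECT (order_id IS NOT NULL AND amount >= 0)
    ON VIOLATION DROP ROW
)
AS SELECT order_id,
          CAST(amount AS DOUBLE) AS amount
   FROM STREAM(LIVE.bronze_orders);
```

DLT manages the orchestration, retries, and quality reporting for such tables, which is what distinguishes it from hand-rolled Structured Streaming jobs.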
Benefits
Premium medical package
Lunch Tickets & Pluxee Card
Bookster subscription
13th salary and yearly bonuses
Enterprise stability with a startup mentality (diverse and engaging environment, international exposure, flat hierarchy), backed by a secure multinational
A supportive culture (we value ownership, autonomy, and healthy work-life balance) with great colleagues, team events and activities
Flexible working program and openness to remote work
Collaborative mindset – employees shape their own benefits, tools, team events and internal practices
Diverse opportunities in Software Development with international exposure
Flexibility to choose projects aligned with your career path and technical goals
Access to leading learning platforms, courses, and certifications (Pluralsight, Udemy, Microsoft, Google Cloud)
Career growth & learning – mentorship programs, certifications, professional development opportunities, and above-market salary