Big Data Engineer focused on architecting and deploying scalable Apache Spark environments. Working on Nebul’s sovereign AI cloud to enhance data performance and security.
Responsibilities
Architect, deploy, and operate scalable Apache Spark environments on Nebul’s sovereign AI cloud
Design and optimize Spark workloads for GPU-accelerated and distributed performance
Define and implement best practices for security, monitoring, governance, and data protection
Partner closely with product, engineering, and customer teams to shape our managed Spark offering
Evaluate and integrate complementary technologies (e.g., Delta Lake, Lakehouse components, tooling)
Support early customer pilots and translate feedback into roadmap improvements
Develop automation and CI/CD deployment models to ensure reliability, repeatability, and efficiency
Document architectures, operational procedures, and performance benchmarks
Requirements
4–7 years of experience working with Apache Spark in production environments
Deep knowledge of Spark internals: performance tuning, partition strategies, caching, and shuffle management
Hands-on deployment experience in Kubernetes, cloud infrastructure, or on-prem clusters
Solid understanding of distributed data platforms (e.g., Databricks, EMR, Hadoop, Lakehouse architectures)
Strong scripting and automation skills (Python / Scala preferred)
Ability to translate client needs into technical architectures and operational models
Familiarity with cloud-security principles and infrastructure-as-code practices
Valid EU work permit (no sponsorship currently available)
Data Engineer developing data platforms for Octopus Electric Vehicles. Build and optimise data solutions to transform insights for sustainable transportation.
Senior Data Engineer developing and maintaining data pipelines and solutions for EMC's data analytics products. Collaborating with cross-functional teams to ensure high-quality data management and governance practices.
Principal Product Manager leading product strategy for health data platform at PointClickCare. Collaborating across teams to optimize health data for analytics and care delivery.
Senior Data Engineer designing and optimizing data platforms for clients using Microsoft Azure, Microsoft Fabric, Power BI, and Databricks. Working closely with clients to deliver scalable solutions.
Data Engineer providing technical expertise on mission-critical NAVSUP OIS program. Work involves data architecture and database management in AWS GovCloud environments.
Senior Data Engineer focusing on data infrastructure for an AI-driven insurtech startup based in Nepal. Collaborating with teams to optimize data models and maintain data quality.
Senior Professional Consultant leading architecture and design for SAP BW and SAC solutions at Freudenberg. Collaborating with stakeholders and optimizing performance of data landscapes.
Senior Data Engineer designing and managing data architectures to transform large-scale data into insights for Humana. Involves leading technical discussions and implementing best data practices.
Data Engineer II at Early Warning Services developing data science tools and infrastructure. Collaborating on software enhancements and mentoring interns in a hybrid work environment.