Data Engineer II building pilot datasets and production-grade data platforms at GeoComply. Collaborating with product teams to deliver data-driven features for geolocation compliance.
Responsibilities
Build Pilot Datasets: Rapidly design and develop experimental data models and datasets to support pilot product features and validate hypotheses.
Bridge Business & Data: Collaborate closely with product managers to translate functional requirements into initial data schemas and logic.
Ad-Hoc to Self-Serve: Construct initial logic for ad-hoc data requests and evolve them into standardized, self-serve tools for the product team.
Productionize Pipelines: Take successful pilot datasets and transform them into robust, production-grade data pipelines. Refactor "pilot code" to follow best practices, ensuring high performance and data quality in the production environment.
Foundation Development: Build and maintain the internal libraries, services (Airflow), and Databricks jobs required to run these datasets at scale.
Project Management: Manage the lifecycle of data products from "epic-size" concepts through to delivery and maintenance.
Stakeholder Communication: Effectively communicate technical constraints and data insights to stakeholders during the transition from pilot to production.
Product Ideation: Actively contribute ideas on how data can drive new product benefits during weekly team calls.
Requirements
Four (4) years of relevant experience, with a focus on bridging database technology with product requirements.
You possess strong product thinking skills: the ability to analyze requirements, identify customer pain points, and build data solutions that contribute directly to the product vision.
Strong skills in Databricks, Spark/PySpark, and SparkSQL to manipulate heavy datasets during both the discovery and production phases.
Extensive experience with MySQL (for operational data) and NoSQL, with an understanding of how to model data for analytics vs. applications.
Proven ability to operate on epic-size projects, specifically managing the timeline from "proof of concept" to "delivered feature".
Strong interpersonal skills, adept at explaining complex data concepts to non-technical product stakeholders.
Familiarity with Git, Linux environments, and the Prom/Loki Stack for monitoring data health.
Senior Data Architect defining comprehensive data strategy and architecture for AI. Delivering the organization's data vision and ensuring governance and technical oversight of enterprise data architecture.
Data Engineer at Booz Allen utilizing data to impact critical missions like fraud detection and cancer research. Collaborating with analysts and developers on advanced technology solutions.
Technical Product Manager for Data Engineering at Betclic. Owning product roadmaps, driving data infrastructure evolution, and ensuring alignment across engineering teams.
Senior Data Engineer maturing strategic data assets and delivering business analytics in a regulated financial environment. Collaborating with stakeholders to advance business data strategy on cloud platforms.
Staff Data Engineer leading scalable data solutions for analytics and reporting at Asurion. Designing data pipelines and ensuring data quality across cloud platforms.
Lead Architect at Travelers shaping enterprise database and data platform solutions. Collaborating across technology and business units to drive digital transformation and modernization efforts.
Senior Data Engineer at Porto Bank responsible for developing robust data architecture and ensuring data quality across teams. Working across various data solutions for effective data management.
Data Engineer II at Dun & Bradstreet collaborating with teams to enhance data quality standards and practices. Driving best-in-class data management across diverse disciplines.
Senior Data Engineer at NVIDIA developing and optimizing database architectures while collaborating across software and hardware teams. Focus on data-driven systems in data center environments for complex networking verification.
Senior Data Engineer building reliable data infrastructure for AI-powered health experiences. Collaborating on data pipelines and ensuring data quality in a hybrid work environment.