Senior Data Engineer building scalable data pipelines at Monstro, an innovative fintech company. Shape the future of data architecture in a data-intensive environment.
Responsibilities
Build and own scalable pipelines that parse and normalize unstructured sources for retrieval, knowledge graphs, and agents.
Conceive and implement novel methods for processing thousands of types of unstructured documents with accuracy and consistency.
Process semi-structured sources into consistent, validated schemas.
Transform structured datasets for analytics, features, and retrieval workloads.
Create, version, and maintain multiple collections in a vector database.
Manage embeddings, metadata, and lifecycle, and tune chunking and filters for relevance and latency.
Design and implement robust multi-modal document processing systems that handle heterogeneous file formats (PDFs, images, HTML, XML).
Own ingestion from APIs, file drops, partner feeds, and scheduled jobs with monitoring, retries, and alerting.
Implement data quality checks for schema, ranges, and nulls, and document lineage and SLAs.
Stand up and harden object, relational, document, and vector stores with the right indexing and partitioning.
Build reusable libraries and services for parsing, enrichment, and embedding generation.
Handle sensitive financial and personal data with access controls, auditing, and retention policies.
Partner with product and engineering to ship features that depend on reliable data.
Document standards, coach teammates, and contribute to future hiring.
Requirements
Minimum of 2 years in a dedicated Data Engineering role at an AI-native startup, or 4+ years of experience in traditional Data Engineering, with roughly 8 or more years of overall experience in tech.
Proven ownership of end-to-end pipelines (ingestion → transformation → serving), including scalable sourcing processes, ETL pipelines, and serving services.
Experience owning and operating infrastructure in production environments.
Strong Python and SQL.
Hands-on document parsing and ETL across PDFs, HTML, JSON, and XML.
Experience operating vector databases such as pgvector, Pinecone, or Weaviate, with multiple collections.
Experience building and scheduling ingestion via APIs, web downloads, and cron or an orchestrator, plus cloud storage and queues.
Understanding of embeddings, chunking strategies, metadata design, and retrieval evaluation.
Solid data modeling, schema design, indexing, and performance tuning across storage types.
History of implementing data quality checks, observability, and access controls for sensitive data.
Track record of delivering high-consistency systems for mission-critical data pipelines.
Ownership mindset, clear written communication, and effective collaboration with product and engineering.
Senior Data Engineer role at Dun & Bradstreet focused on data analytics and visualization. Collaborating with teams to optimize data processes and deliver actionable insights.
Senior Data Engineer with AWS expertise leading financial data architecture and scalable solutions. Collaborating in wealth management to enhance data quality and systems.
Data Migration Specialist handling large-scale data migration from legacy to enterprise PLM platform. Analyzing data structures, developing strategies, and ensuring integrity across systems.
Director leading strategy, governance, and delivery of enterprise data platform at Phillips 66. Partnering with AI, Data Science, and business teams to enhance analytics and business systems.
Product Owner driving ERP data migration initiatives for BioNTech’s global landscape. Leading effective data management and ensuring compliance with regulatory standards in a fast-paced environment.
Data Engineer II leading development and delivery of data pipelines for Syneos Health. Collaborating with teams to optimize data processing and integrate solutions into production environments.
Lead Data Engineer overseeing data operations and analytics engineering teams for OneOncology. Focused on operational excellence in data platform and model reliability for cancer care improvement.
Senior AWS Software Data Engineer at Boeing focusing on AWS Data services to support digital analytics capabilities. Collaborating with cross-functional teams to design, develop, and maintain software data solutions.
Senior Data Engineer designing and improving software for business capabilities at Barclays. Collaborating with teams to build a data and intelligence platform for Equity Derivatives.