Databricks Architect leading enterprise data platform implementations at Allata, blending architectural responsibilities with hands-on technical leadership across data products and pipelines.
Responsibilities
Define the overall data platform architecture (Lakehouse/EDW), including reference patterns (Medallion, Lambda, Kappa), technology selection, and integration blueprint.
Design conceptual, logical, and physical data models to support multi-tenant and vertical-specific data products; standardize logical layers (ingest/raw, staged/curated, serving).
Establish data governance, metadata, cataloging (e.g., Unity Catalog), lineage, data contracts, and classification practices to support analytics and ML use cases.
Define security and compliance controls: access management (RBAC/IAM), data masking, encryption (in transit/at rest), network segmentation, and audit policies.
Architect scalability, high availability, disaster recovery (RPO/RTO), and capacity & cost management strategies for cloud and hybrid deployments.
Lead selection and integration of platform components (Databricks, Delta Lake, Delta Live Tables, Fivetran, Azure Data Factory / Data Fabric, orchestration, monitoring/observability).
Design and enforce CI/CD patterns for data artifacts (notebooks, packages, infra-as-code), including testing, automated deployments, and rollback strategies.
Define ingestion patterns (batch and streaming), file compaction strategies, partitioning schemes, and storage layout to optimize I/O and costs.
Specify observability practices: metrics, SLAs, health dashboards, structured logging, tracing, and alerting for pipelines and jobs.
Act as technical authority and mentor for Data Engineering teams; perform architecture and code reviews for critical components.
Collaborate with stakeholders (Data Product Owners, Security, Infrastructure, BI, ML) to translate business requirements into technical solutions and roadmap.
Design, develop, test, and deploy processing modules using Spark (PySpark/Scala), Spark SQL, and database stored procedures where applicable.
Build and optimize data pipelines on Databricks and complementary engines (SQL Server, Azure SQL, AWS RDS/Aurora, PostgreSQL, Oracle).
Implement DevOps practices: infra-as-code, CI/CD pipelines (ingestion, transformation, tests, deployment), automated testing and version control.
Troubleshoot and resolve complex data quality, performance, and availability issues; recommend and implement continuous improvements.
Requirements
Previous experience as an architect or in a lead technical role on enterprise data platforms.
Hands-on experience with Databricks technologies (Delta Lake, Unity Catalog, Delta Live Tables, Auto Loader, Structured Streaming).
Strong expertise in Spark (PySpark and/or Scala), Spark SQL, and distributed job optimization.
Solid background in data warehouse and lakehouse design; practical familiarity with Medallion/Lambda/Kappa patterns.
Experience integrating SaaS/ETL/connectors (e.g., Fivetran), orchestration platforms (Airflow, Azure Data Factory, Data Fabric) and ELT/ETL tooling.
Experience with relational and hybrid databases: MS SQL Server, PostgreSQL, Oracle, Azure SQL, AWS RDS/Aurora or equivalents.
Proficiency in CI/CD for data pipelines (Azure DevOps, GitHub Actions, Jenkins, or similar) and packaging/deployment of artifacts (.whl, containers).
Experience with batch and streaming processing, file compaction, partitioning strategies and storage tuning.
Good understanding of cloud security, IAM/RBAC, encryption, VPC/VNet concepts, and cloud networking.
Familiarity with observability and monitoring tools (Prometheus, Grafana, Datadog, native cloud monitoring, or equivalent).
Benefits
At Allata, we value differences.
Allata is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.