Databricks Senior DevOps Engineer designing and operating platforms on AWS and Databricks for Financial Crime. Focused on platform infrastructure, governance, security, and operations.
Responsibilities
Architect, build, and operate end-to-end data and ML platforms on AWS and Databricks.
Own and administer Databricks workspaces for the Financial Crime platform.
Lead DevSecOps and DataOps practices, including infrastructure-as-code (IaC) and CI/CD pipelines for data and ML workflows.
Configure and optimize Databricks compute clusters (job clusters and all-purpose clusters) for performance, scalability, and cost efficiency.
Manage and enforce governance through Unity Catalog, including access control, security policies, data lineage, and isolation.
Build and operate ML infrastructure, including model deployment and serving endpoints.
Integrate AWS services (e.g., S3, Redshift, Kinesis, Lambda, EKS/ECS) with Databricks runtime and Delta Lake.
Implement platform security best practices, including secrets management, audit logging, and compliance controls.
Optimize system performance and diagnose large-scale production issues.
Mentor engineering teams and define architectural best practices for high-scale data and ML systems.
Requirements
7+ years of experience in data platform architecture, cloud infrastructure, or ML platform engineering.
Strong enterprise-level experience with Databricks and AWS.
Deep expertise in Unity Catalog governance and security models.
Hands-on experience designing, deploying, and operating Databricks clusters in production.
Experience managing Model Serving / ML deployment infrastructure.
Strong implementation mindset: able to design, build, deploy, and operate platforms end-to-end.
Experience operating in regulated environments (banking/fintech preferred).
Lead DevOps Engineer focused on AWS and Azure data platform solutions. Collaborating with teams to deliver scalable, secure, and highly available solutions.
DevOps Engineer working at GRÜN Software Group to automate and maintain stable infrastructures. Collaborating with teams to improve deployments and processes for better performance.
Linux System Administrator managing IT infrastructures for educational institutions and research. Collaborating on DevOps and HPC projects while ensuring system security and performance.
Azure SRE Engineer responsible for designing and maintaining secure, scalable Azure cloud infrastructure. Driving automation and operational excellence for leading organizations in technology transformation.
Senior Manager of Site Reliability Engineering overseeing Workday's Kubernetes-based platform. Leading teams while ensuring high availability and collaborating with federal agencies.
Site Reliability Engineer focusing on AWS cloud environments, SRE practices, and system reliability within GFT's team. Collaborating on cloud migrations and observability initiatives.
Senior DevOps Analyst enhancing infrastructure automation in a transformative technology firm. Collaborating on innovative projects in sectors like healthcare, finance, and utilities in Brazil.
Consultant at Minsait supporting technical decisions in infrastructure automation and developing solutions. Collaborating with teams to maintain and evolve automation platforms.
Practical Trainee focusing on hardware reliability engineering at Sonova. Supporting reliability improvement initiatives and working closely with experienced engineers on real-life product challenges.
Configuration Management Engineering Technician supporting naval shipbuilding projects with engineering documentation and configuration integrity. Establishing and maintaining relationships with stakeholders in the shipbuilding community.