Sr. Platform Engineer at Comcast responsible for optimizing Kubernetes infrastructure and managing large-scale Spark workloads. Collaborating with teams to ensure performance and reliability in data processing environments.
Responsibilities
Building, managing, and optimizing the underlying infrastructure and tools for large-scale data processing workloads.
Designing systems for collecting metrics (Prometheus) and visualizing data (Grafana).
Architecting and managing the platforms where Spark runs, such as Kubernetes clusters or cloud services like AWS (EKS).
Packaging Spark workloads and integrating them with orchestration systems.
Deploying infrastructure via Terraform/Ansible and troubleshooting job failures.
Building automation and tools in languages like Python, Java, or Scala, as well as Linux shell scripting (Bash).
Implementing and maintaining systems for monitoring, logging, and alerting.
Developing and optimizing the data catalog platform (e.g., Apache Iceberg).
Collaborating with Data Stewards, Analysts, and Scientists to address data needs and issues.
Creating and maintaining documentation for Kubernetes infrastructure and providing training to team members.
Requirements
Bachelor's degree in computer science or a related field, or equivalent experience; typically 7 years in a DevOps or Systems Engineering role.
Expertise in Apache Spark: Deep understanding of Spark architecture, including RDDs, DataFrames, execution hierarchy, lazy evaluation, shuffling, and fault tolerance.
Proficiency in languages used for Spark development and automation, such as Python, PySpark, and Scala/Java.
Proficient in Linux shell scripting (Bash).
Proficient in writing SQL.
Experience with CI/CD tools and GitHub.
Experience setting up and using observability tools such as Prometheus and Grafana.
Strong knowledge of networking (TCP/IP, DNS, load balancers, etc.) and hardware components.
Automation via Terraform/Ansible.
Hands-on experience with on-prem environments and major cloud providers (AWS, Azure, GCP), and with container and orchestration tools like Docker and Kubernetes.
Hands-on experience setting up AWS services such as IAM, VPC, and EC2.
Familiarity with related technologies and formats like Delta Lake, Apache Iceberg, Apache Kafka, Hadoop, and various data storage systems (S3, HDFS, etc.).
Hands-on experience with Databricks, Snowflake, Apache Iceberg, Unity Catalog, or similar tools.
Solid understanding of data lakes and governance.
Experience setting up and maintaining caching layers like Alluxio.
Strong analytical skills for debugging complex distributed systems issues.
Strong communication and collaboration abilities.
Benefits
Best-in-class benefits for eligible employees
Expert guidance and always-on tools
Physical, financial, and emotional support during big milestones and in everyday life
Senior Data Platform Engineer optimizing data processes for a Montreal IT consulting firm. Involving governance, ingestion pipelines, and scalable architecture in data management.
Platform Engineer developing and operationalizing IronSled to enable secure software delivery at scale for government environments. Leveraging cloud and on-premises infrastructure compliant with federal standards.
Freelance Data Platform Engineer for fintech, focused on Snowflake and AWS configurations. Enhancing data pipelines and ensuring robust data flows for a modern data environment.
Linux Platform Engineer building and scaling infrastructure pipelines for MULTIVAC. Collaborating on transformation from Windows environments to Linux container platforms.
Lead platform engineering at Chain IQ, focusing on shared systems and automation for global procurement solutions. Spearhead initiatives for coherent and scalable platform foundations to enhance developer experiences.
Founding Platform Engineer to design, build, and ship core systems at CEF AI. Contribute to AI infrastructure for real-time data processing and developer tooling in a hybrid role.
Senior Software Platform Engineer focusing on developing software components for building management systems at Johnson Controls. Requires onsite work at least 3 days a week in Glendale, WI.
Development Lead in Automation/Artificial Intelligence team driving A/AI efforts across processes and functions. Responsible for full-stack development and leading projects in GenAI/Agentic AI.
Senior Platform Engineer overseeing operationalization for Comcast Business in Texas, Pennsylvania, Virginia, or Colorado. Leading technical solutions and mentoring teams to enhance platforms and user experiences.
Senior Platform Engineer at MoonPay improving cloud platform resiliency and reliability. Collaborate with teams to implement solutions using Kubernetes and monitor performance metrics.