Data Platform Engineer developing data ingestion pipelines for a high-volume data processing system at Mercari. Collaborating with teams to ensure successful data integration.
Responsibilities
Design, develop, and maintain data ingestion pipelines for a high-volume data processing system that collects data from mobile and web applications (clients).
Develop and maintain streaming data pipelines that ingest raw data and write it to a data warehouse and a lakehouse.
Implement batch data transformation pipelines.
Write SQL queries to extract, transform, and load data from various sources.
Collaborate with data scientists, data analysts, and software engineers to understand data requirements, manage data schemas, and ensure successful data integration.
Manage and maintain the CI/CD release pipeline.
Utilize Docker, YAML, Bash scripting, Terraform, and other technologies to automate infrastructure provisioning and deployments.
Monitor and troubleshoot pipeline issues to ensure smooth data flow and data quality.
Write clean, maintainable, and well-documented code.
Develop and execute on the long-term goals and roadmap of the data platform.
Develop and maintain a REST service that receives user events from clients at a very high request rate (RPS).
Develop and maintain a logging SDK for the server-side system.
Requirements
Resonates with the mission and values of the Mercari Group and its individual companies
Experience with streaming data processing frameworks such as Apache Beam, Spark, or Flink.
Experience with data warehouse technologies such as Google BigQuery, Amazon Redshift, Hive/Hadoop, or Snowflake.
Experience designing, developing, and operating large-scale services, distributed systems, or data pipelines in languages such as Go, Python, Java, or Scala.
Experience with building APIs and using data serialization formats (e.g., Protobuf, Avro, Parquet).
Experience in writing design documents or technical proposals and reaching agreements with stakeholders.
Familiarity with monitoring and alerting tools.
Experience with Google Cloud Platform (Dataflow, Pub/Sub, Kubernetes Engine, Compute Engine).
Experience with Confluent Cloud or Apache Kafka.
Experience with workflow engines such as Argo Workflows or Apache Airflow.