Data Engineer (ETL) responsible for AI model data cleaning and development. Leading projects in a hybrid work environment in Singapore.
Responsibilities
1. Responsible for data cleaning (ETL) and data warehouse construction to support large-scale AI models.
2. Responsible for training and fine-tuning large AI models to meet the requirements of specific business scenarios.
3. Responsible for developing supporting tools, such as dashboards and general business logic, to ensure the practicality of AI model applications.
4. Must have hands-on development experience and be able to lead a team or independently complete projects related to data collection and development.
Requirements
1. A degree in computer science or a related field is preferred. Must have solid knowledge of machine learning, deep learning, and natural language processing, at least 1 year of experience in GPT or Gemini application development, and proficiency in deep learning frameworks such as PyTorch or TensorFlow.
2. Familiar with models such as Transformer, BERT, and GPT and with fine-tuning algorithms such as LoRA, with experience in fine-tuning models.
3. Must have Java programming experience.
4. Experience in backend Java development for data engineering use cases, particularly real-time processing with Apache Flink.
5. Must have experience in data warehouse development and construction, such as using Flink and building ETL data cleaning pipelines.
6. Experience with large model pre-training and practical application in business scenarios is a plus.
7. Must have hands-on experience in setting up large models based on open-source frameworks.
8. Experience in conversational AI, marketing content generation, or machine translation is preferred.
9. Priority will be given to candidates with hands-on experience with Google Cloud Platform (GCP), particularly BigQuery.
Benefits
1. Lead community-building for Southeast Asia's largest parenting ecosystem.
2. Be at the forefront of connecting brands with real parents in authentic and impactful ways.
3. Work with a passionate team driving innovation in the parenting space.
4. Regional exposure across three of SSEA's most dynamic markets.
Data Engineer designing data pipelines in Python for a major railway industry client. Collaborate with Data Scientists and ensure code quality with agile methodologies.
Senior Data Engineer responsible for building and optimizing data pipelines for banking analytics initiatives. Collaborating with data teams to ensure data quality and readiness for enterprise use.
Senior Data Engineer developing scalable data solutions on Databricks for analytics and operational workloads. Collaborating with cross-functional teams to modernize the data ecosystem.
Data Engineer focused on analytics and data pipeline development for network optimisation. Collaborating with teams to deliver high-quality data solutions with Python and SQL.
Senior Product Manager defining platform capabilities for Data Cloud in Salesforce. Collaborating with R&D teams while shaping product strategy for Data 360 integration.
Senior Data Engineer at Goodwin enhancing data platforms and fostering data-driven culture across teams. Collaborating with IT and Finance on technology solutions and data governance practices.
Director, Data Platform Design and Strategy at MedImpact leading data platform and AI innovations to enhance healthcare services. Overseeing enterprise projects and managing teams to meet strategic goals.
Data Engineer delivering AI- and data-driven solutions for Honeywell’s industrial customers. Architecting and implementing scalable data pipelines and platforms focused on IoT and real-time data processing.
Data Engineering Associate focusing on data quality control and management for a distribution platform. Collaborates on large-scale data projects to ensure data accuracy and availability for users.
Data Architect managing enterprise data platform built on Microsoft Fabric at Johnstone Supply. Leading architectural standards and collaborating with business and IT leaders for strategic data-driven insights.