Mid-level Data Engineer ensuring efficient data transformation and integration for data annotation projects. Collaborating with teams to optimize data quality and performance in pipeline operations.
Responsibilities
Perform data transformation and integration tasks so that data from various sources is accurately processed, ready for annotation projects, and delivered to clients according to their requirements.
Ensure efficient data storage, retrieval, and optimization to support data annotation workflows.
Implement data quality checks and validation processes to ensure the accuracy, consistency, and integrity of data used in annotation projects.
Collaborate with the Data Engineering team to design and implement robust and scalable data pipelines for importing and exporting data used in our data annotation projects.
Identify and address performance bottlenecks in data pipelines to enhance the speed and efficiency of data import and export processes.
Continuously seek opportunities to automate manual processes and improve data annotation workflows for increased productivity.
Work closely with cross-functional teams, including data annotation teams, backend developers, and project managers, to understand project requirements and provide timely data support.
Maintain comprehensive documentation of data pipelines, processes, and data structures to facilitate knowledge sharing and seamless project handovers.
Address and resolve data-related issues, providing technical support to data annotation teams when required.
Stay abreast of industry trends, tools, and technologies related to data engineering, and propose innovative solutions for data annotation projects.
Requirements
Bachelor’s degree in Computer Science, Data Engineering, or a related field.
Proven experience as a Data Engineer, including 4 years of hands-on work in data pipeline design, data transformation, pipeline orchestration, and data integration, particularly for unstructured and semi-structured data.
Proficiency in programming languages such as Python, SQL, or Scala, and experience with data manipulation libraries and frameworks.
Experience with Airflow or n8n is a plus.
Experience with Ruby is a big plus.
Knowledge of and experience with machine learning projects is a big plus.
Solid knowledge of data storage and database management systems, including relational and NoSQL databases.
Familiarity with data visualization tools and techniques to facilitate data understanding and analysis.
Experience with AWS QuickSight and AWS Athena is a plus.
Solid understanding of data quality and data governance principles.
Familiarity with Data Lake concepts and with Apache Iceberg.
Experience with cloud-based data platforms, such as AWS, GCP, or Azure, is a plus.
Strong problem-solving skills with a keen eye for detail.
Excellent communication and collaboration skills, with the ability to work effectively in a team-oriented environment.
A passion for data engineering and a desire to contribute to impactful data annotation projects.
Benefits
LXT is an equal opportunity employer and ensures that no applicant is subject to less favorable treatment on the grounds of gender, gender identity, marital status, race, color, nationality, ethnicity, age, sexual orientation, socio-economic status, responsibilities for dependents, or physical or mental disability.
Any hiring decision is made on the basis of skills, qualifications, and experiences.
We measure our success as a business not only by delivering great products and services and continually increasing our assets under administration and market share, but also by how we positively impact people, society, and the planet.