Data Engineer with AI/ML at Pulte Mortgage focused on enhancing data-driven culture and infrastructure. Responsibilities include designing data pipelines, collaborating with data scientists, and ensuring data quality.
Responsibilities
Design new and improve existing data infrastructures, including the Lakehouse, data warehouses, dataflows, data pipelines, semantic models, and reports
Migrate large-scale data stores from the existing on-premises SQL Server infrastructure to the new Microsoft Fabric-based infrastructure
Classify and organize data based on identified taxonomy structures
Work with our enterprise and data architects to ensure that the data is of high quality and meets the organization’s requirements
Optimize data processing by using modern data engineering tools such as notebooks, dataflows, data pipelines, semantic models, and reports
Provide technical expertise during the design, planning, development, implementation, and testing of digital solutions, which are often custom-developed and integrate new technologies
Understand our technological systems and strategic vision, and facilitate the technical work needed to produce integrated, end-to-end digital solution options
Experiment and find ways to use AI and ML to improve our processes or deliver business impact
Collaborate with data scientists to productionize ML models and integrate them into data pipelines
Participate in cross-project planning and release planning activities
Write and maintain concise documentation about our development process and major systems
Build scalable, maintainable, easy-to-use software following our development best practices and requirements laid out by the architect and the development team
Collaborate with product owners and end-users to understand any desired business functionality
Regularly review application logs and dashboards to proactively monitor for defects, gauge performance, and troubleshoot production problems
Contribute to Pulte Financial Services’ positive, trusting, inclusive culture and team-first environment
Requirements
Minimum high school diploma or equivalent (GED)
Bachelor's Degree in Computer Science or related field highly preferred
4+ years’ software engineering experience with Python, PySpark, Spark, or equivalent notebook programming
3+ years’ experience with XML, SQL, relational databases, and large data repositories
Experience building solutions within Microsoft's Azure cloud environment, specifically Microsoft Fabric, preferred; or a willingness to learn and adopt new cloud-native data platforms
Hands-on experience with data platform technologies such as Kafka, Hadoop, or Spark, preferably their Azure counterparts such as HDInsight, Synapse, Data Lake, and Data Factory
Excellent relational database skills in writing SQL, ETL processes, analyzing and optimizing query plans, and writing DDL scripts
Passion for data and data quality
Passion for building clean and testable code, creating unit tests, and focusing on code quality
Extensive knowledge and experience with Power BI or other widely used data solutions
Highly self-motivated and directed with a strong sense of curiosity and drive to accomplish goals and support the data product team
Experience with AI, ML, agents, and other automation tools is a huge plus
Experience with ML frameworks (e.g., scikit-learn, TensorFlow, Azure ML) is a plus
Experience with API and integration concepts
Knowledge of data pipelines, CI/CD concepts, DataOps/MLOps, and general software deployment lifecycles for continuous integration, delivery, and monitoring
Exceptional verbal and written communication and collaboration skills, with the ability to interact effectively with a wide range of technical and non-technical stakeholders
Experience participating in Agile methodologies, particularly Scrum, with a track record of successful product delivery
Benefits
Up to 9 paid company holidays per year
Up to 6 days of sick pay
Up to 17 PTO days per year (and up to 22 PTO days per year upon 10 or more years of service)
Eligible to participate in the Company’s 401(k) Plan
Medical, dental, and vision insurance coverage
Company-paid disability, basic life insurance, and parental leave
Voluntary insurance coverage options, including critical illness, accident, and hospital indemnity
Senior Data Engineer building and maintaining robust data pipelines for various data products at Beep Saúde. Collaborating within the team and leading data governance practices.
Software Developer in Test working on a cloud-based data platform at Tecsys. Ensuring quality and reliability of data pipelines and transformations using automation frameworks.
Data Engineer responsible for designing, building, and optimizing data pipelines and architectures in a tech environment. Requires extensive experience with modern data warehousing and cloud platforms.
Lead Data Engineer role at Brillio focusing on AI & Data Engineering with expertise in Azure and MS Fabric. Collaborate within the Data Engineering team in Pune, Maharashtra, India.
Data Architect at Whiteshield designing scalable, secure data architectures for national and enterprise transformation programs. Architecting modern data platforms to support analytics, AI, and operational use cases.
Data Engineer managing scalable data ecosystems for actionable business intelligence and cross-functional stakeholder collaboration. Optimizing ETL/ELT pipelines and ensuring data integrity and security.
Data Engineer specializing in data architecture and solutions for a banking environment, driving value for customers through innovative engineering practices and technologies in data management.
Technical Lead for data engineering and reporting in healthcare technology at Dedalus. Shaping innovative software solutions and leading cross-functional technical teams in Australia.
Senior ML Data Engineer working on data pipeline curation for Mobileye's autonomous vehicle dataset. Collaborating across teams to enhance ML engineering and vision model applications.
Data Engineer managing customer datasets to enhance industrial research and development. Responsible for ETL pipelines and data ingestion for the Uncountable Web Platform.