Software Engineer delivering MLOps solutions for Generative AI at DataGalaxy. Focusing on reliability and collaboration with product engineering teams in a hybrid environment.
Responsibilities
Join the AI-Share team to help build and operate the foundations that power our Generative AI features (LLM, RAG, agents) inside the DataGalaxy data governance platform.
This role focuses on MLOps / ModelOps delivery: making GenAI capabilities reliable in production (deployment, monitoring, cost control, traceability), while collaborating with product engineering teams across a polyglot stack.
Contribute to the evolution of our **ModelOps platform** for GenAI: provider integrations, configuration, deployment automation, and operational tooling.
Help implement practical patterns for running GenAI workloads in production: **evaluation**, **versioning**, **reproducibility**, safe rollouts/rollbacks, and environment management.
Build and improve **CI/CD workflows** adapted to AI: packaging, automated checks, evaluation steps (when applicable), deployment, and rollback.
Improve **traceability** of AI assets (configs, prompts/templates when applicable, evaluation outputs, versions) to support governance and debugging.
Add and maintain **observability** for GenAI workloads: latency, availability, usage/cost signals, and quality-related indicators (dashboards/alerts).
Develop and improve **GenAI features** within the platform (agent, RAG pipelines, MCP server): new capabilities, prompt engineering, bug fixes, and client-facing improvements.
Work closely with Product / Data / Engineering to integrate GenAI capabilities into the platform in a maintainable way.
Participate in code reviews, documentation, and post-incident follow-ups (RCA / action items), with guidance from the team.
Requirements
Professional experience delivering production software in **Python** (services, tooling, automation): comfortable reasoning about service design, maintainability, and code quality.
Familiarity with **CI/CD** and shipping changes to production (pipelines, environments, rollbacks, release hygiene).
Comfortable with cloud/production constraints: reading logs/metrics, debugging issues, improving reliability over time.
Comfortable working in a **polyglot environment**: you can read/understand code and interfaces beyond Python, and you have experience with **at least one other language** (e.g., C#, TypeScript, Java/Kotlin, Go).
**Nice-to-have (big plus)**
Comfortable working with **AI-assisted development tools** (e.g. coding agents, copilots) as part of your daily workflow, and open to evolving your practices as these tools mature.
Hands-on exposure to **LLM/RAG/agents** in real projects (prompt/version management, evaluation approaches, basic safety/guardrails).
Familiarity with managed GenAI platforms (Azure AI Foundry / Bedrock / Vertex) or similar services.
Exposure to self-hosted inference servers (e.g. vLLM) and/or multi-model routing solutions (e.g. LiteLLM), with an understanding of the architectural trade-offs versus managed providers, including for potential on-premises deployments.
Experience with containers and orchestration (Docker, Kubernetes), plus service-to-service patterns.
Infrastructure-as-code experience (Terraform or equivalent).
Observability experience (OpenTelemetry, dashboards/alerting) and cost monitoring.
Familiarity with Python API frameworks (e.g. FastAPI) and software design principles (SOLID, modular architecture, dependency injection).
Prior exposure to parts of our broader stack (e.g., **.NET**, **Angular**) is welcome but not required.
Benefits
Offices in the heart of Lyon (Part-Dieu) and Paris (2nd arrondissement)
Flexible working hours ("forfait jour")
Remote work at will, plus €2.70 net per day worked from home
2 weeks of working from anywhere 🌍
Health insurance Apicil covering you and your family
Meal vouchers (Swile card of 9€/day)
Public transport 50% reimbursement, 100% reimbursement for your bike subscription
Holiday Bonus 🏝️
Quarterly team events and seminars
Attractive remuneration based on your performance and potential
A real opportunity to join a French start-up that is a pioneer in its market 🚀