Maturity Framework for Enhancing Machine Learning Quality
Angelantonio Castelli, Georgios Christos Chouliaras, Dmitri Goldenberg
TL;DR
This work addresses the challenge of ensuring high-quality, reproducible ML systems through a structured Quality Assessment and Maturity Framework. It defines seven quality characteristics and a quantified scoring approach, supported by an open-source Python package and an ML Registry for automated, scalable evaluation. The framework is validated in a Booking.com deployment, demonstrating measurable improvements in quality and business outcomes, and is presented as a practical pathway to stronger ML governance. The authors argue that this approach can reshape industry standards and be extended to evolving ML paradigms such as GenAI, with ongoing refinements to tooling and domain-specific criteria.
Abstract
With the rapid integration of Machine Learning (ML) into business applications and processes, it is crucial to ensure the quality, reliability and reproducibility of such systems. We propose a methodical approach to ML system quality assessment and introduce a structured maturity framework for ML governance. We emphasize the importance of quality in ML and the need for rigorous assessment, motivated by issues in ML governance and gaps in existing frameworks. Our primary contribution is a comprehensive, open-source quality assessment method, validated with empirical evidence and accompanied by a systematic maturity framework tailored to ML systems. Drawing on applied experience at Booking.com, we discuss challenges and lessons learned during large-scale adoption within an organization. The study presents empirical findings, highlighting quality improvement trends and showcasing business outcomes. The maturity framework for ML systems aims to become a valuable resource for reshaping industry standards and to enable a structured approach to improving ML maturity in any organization.
