Maintainability Challenges in ML: A Systematic Literature Review

Karthik Shivashankar, Antonio Martini

TL;DR

The paper identifies maintainability challenges in ML systems through a systematic literature review that screened more than 13,000 papers and qualitatively analyzed 56 of them. It catalogues challenges across data engineering and model engineering and maps 13 maintainability challenges to the interdependent stages of the ML life cycle, revealing a Repetitive Maintenance anti-pattern in which issues in one stage trigger repeated fixes in others. The work draws implications for ML tool developers and researchers, including the need for provenance, governance, and standardized pipelines. By clarifying how data dependency, drift, testing, monitoring, and deployment interact, the study offers a roadmap toward more maintainable ML systems with lower long-term maintenance cost.

Abstract

Background: As Machine Learning (ML) advances rapidly in many fields, it is being adopted by academics and businesses alike. However, ML poses a number of maintenance challenges not found in traditional software projects. Identifying what causes these maintainability challenges can help mitigate them early and continue delivering value in the long run without degrading ML performance. Aim: This study aims to identify and synthesise the maintainability challenges in different stages of the ML workflow and to understand how these stages are interdependent and impact each other's maintainability. Method: Using a systematic literature review, we screened more than 13,000 papers, then selected and qualitatively analysed 56 of them. Results: (i) a catalogue of maintainability challenges in different stages of the Data Engineering and Model Engineering workflows, together with a discussion of the current challenges when building ML systems; (ii) a map of 13 maintainability challenges to the different interdependent stages of ML that impact the overall workflow; (iii) insights for developers of ML tools and for researchers. Conclusions: From this study, practitioners and organisations will learn about maintainability challenges and their impact at different stages of the ML workflow. This will enable them to avoid pitfalls and help them build maintainable ML systems. The implications and challenges will also serve as a basis for future research to strengthen our understanding of ML system maintainability.

Paper Structure (21 sections, 2 figures, 1 table)

This paper contains 21 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: Systematic Literature Review Process
  • Figure 2: Mapping the interdependence of maintainability challenges across different stages of the ML life cycle (see Table 1 for the relations)