Green Architectural Tactics in ML-enabled Systems: An LLM-based Repository Mining Study

Vincenzo De Martino, Silverio Martínez-Fernández, Fabio Palomba

Abstract

Context: The increasing adoption of machine learning (ML) and artificial intelligence (AI) technologies raises growing concerns about their environmental sustainability. Developing and deploying ML-enabled systems is computationally intensive, particularly during training and inference. Green AI has emerged to address these issues by promoting efficiency without sacrificing accuracy. While prior research has proposed catalogs of sustainable practices (i.e., green tactics), there remains limited understanding of their adoption in practice and whether additional, undocumented tactics exist.

Objective: This study aims to investigate the extent to which existing sustainable practices are implemented in real-world ML-enabled systems and to identify previously undocumented practices that support environmental sustainability.

Method: We conduct a mining software repository study on 205 open-source ML projects on GitHub. To support our analysis, we design a novel mechanism based on large language models (LLMs) capable of identifying both known and new sustainable practices from code repositories.

Results: Our findings confirm that green tactics reported in the literature are used in practice, although adoption rates vary. Furthermore, our LLM-based approach reveals nine previously undocumented sustainable practices. Each tactic is supported with code examples to aid adoption and integration.

Conclusions: We finally provide insights for practitioners seeking to reduce the environmental impact of ML-enabled systems and offer a foundation for future research in automating the detection and adoption of sustainable practices.

Paper Structure (26 sections, 5 figures, 6 tables)

Figures (5)

  • Figure 1: Overview of the research process.
  • Figure 2: Distribution of Project Metrics Across Categories.
  • Figure 3: The prompt designed to extract architecture green tactics from software repositories.
  • Figure 4: Frequency of Green Tactics in ML Projects.
  • Figure 5: Catalog of 39 Green Architectural Tactics for ML-Enabled Systems, adapted from 10.1145/3639475.3640111 and enhanced with the use frequency of all tactics and the elicitation of nine new tactics. The symbol '*' indicates a new tactic found by our LLM-based mechanism. Colors indicate frequency, binned by quartiles over the 205 ML projects.