Table of Contents
Fetching ...

Build Code Needs Maintenance Too: A Study on Refactoring and Technical Debt in Build Systems

Anwar Ghammam, Dhia Elhaq Rzig, Mohamed Almukhtar, Rania Khalsi, Foyzul Hassan, Marouane Kessentini

TL;DR

The paper investigates refactoring and technical debt in build systems by empirically analyzing 725 build-related commits from Gradle, Ant, and Maven, and by proposing a taxonomy of 24 build refactorings organized into 6 main categories. It links many refactoring types to five technical debt categories via manual commit analysis and a developer survey, providing empirical evidence on how refactorings address TD in build scripts. To aid future work, the authors introduce BuildRefMiner, an LLM-powered tool using GPT-4o that detects build refactorings from diffs, achieving strong performance improvements when using one-shot prompting. The findings offer a structured, technology-agnostic view of build script maintenance, with practical implications for practitioners and researchers and a dataset that can fuel further development of automated build system tooling.

Abstract

In modern software engineering, build systems play the crucial role of facilitating the conversion of source code into software artifacts. Recent research has explored high-level causes of build failures, but has largely overlooked the structural properties of build files. Akin to source code, build systems face technical debt challenges that hinder maintenance and optimization. While refactoring is often seen as a key tool for addressing technical debt in source code, there is a significant research gap regarding the specific refactoring changes developers apply to build code and whether these refactorings effectively address technical debt. In this paper, we address this gap by examining refactorings applied to build scripts in open-source projects, covering the widely used build systems of Gradle, Ant, and Maven. Additionally, we investigate whether these refactorings are used to tackle technical debts in build systems. Our analysis was conducted on \totalCommits examined build-file-related commits. We identified \totalRefactoringCategories build-related refactorings, which we divided into \totalCategories main categories. These refactorings are organized into the first empirically derived taxonomy of build system refactorings. Furthermore, we investigate how developers employ these refactoring types to address technical debts via a manual commit-analysis and a developer survey. In this context, we identified \totalTechnicalDebts technical debts addressed by these refactorings and discussed their correlation with the different refactorings. Finally, we introduce BuildRefMiner, an LLM-powered tool leveraging GPT-4o to automate the detection of refactorings within build systems. We evaluated its performance and found that it achieves an F1 score of \toolFoneScore across all build systems.

Build Code Needs Maintenance Too: A Study on Refactoring and Technical Debt in Build Systems

TL;DR

The paper investigates refactoring and technical debt in build systems by empirically analyzing 725 build-related commits from Gradle, Ant, and Maven, and by proposing a taxonomy of 24 build refactorings organized into 6 main categories. It links many refactoring types to five technical debt categories via manual commit analysis and a developer survey, providing empirical evidence on how refactorings address TD in build scripts. To aid future work, the authors introduce BuildRefMiner, an LLM-powered tool using GPT-4o that detects build refactorings from diffs, achieving strong performance improvements when using one-shot prompting. The findings offer a structured, technology-agnostic view of build script maintenance, with practical implications for practitioners and researchers and a dataset that can fuel further development of automated build system tooling.

Abstract

In modern software engineering, build systems play the crucial role of facilitating the conversion of source code into software artifacts. Recent research has explored high-level causes of build failures, but has largely overlooked the structural properties of build files. Akin to source code, build systems face technical debt challenges that hinder maintenance and optimization. While refactoring is often seen as a key tool for addressing technical debt in source code, there is a significant research gap regarding the specific refactoring changes developers apply to build code and whether these refactorings effectively address technical debt. In this paper, we address this gap by examining refactorings applied to build scripts in open-source projects, covering the widely used build systems of Gradle, Ant, and Maven. Additionally, we investigate whether these refactorings are used to tackle technical debts in build systems. Our analysis was conducted on \totalCommits examined build-file-related commits. We identified \totalRefactoringCategories build-related refactorings, which we divided into \totalCategories main categories. These refactorings are organized into the first empirically derived taxonomy of build system refactorings. Furthermore, we investigate how developers employ these refactoring types to address technical debts via a manual commit-analysis and a developer survey. In this context, we identified \totalTechnicalDebts technical debts addressed by these refactorings and discussed their correlation with the different refactorings. Finally, we introduce BuildRefMiner, an LLM-powered tool leveraging GPT-4o to automate the detection of refactorings within build systems. We evaluated its performance and found that it achieves an F1 score of \toolFoneScore across all build systems.

Paper Structure

This paper contains 28 sections, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Approach Overview
  • Figure 2: BuildRefMiner Prompt
  • Figure 3: Build-related refactorings taxonomy
  • Figure 4: Developer Response on Commit dm0
  • Figure 5: Developer Response of Commit dm1
  • ...and 6 more figures