Consider What Humans Consider: Optimizing Commit Message Leveraging Contexts Considered By Human

Jiawei Li, David Faragó, Christian Petrov, Iftekhar Ahmed

TL;DR

The paper addresses a gap in automated commit message generation (CMG): existing techniques often miss the nuanced contexts that human developers draw on to write high-quality messages. It introduces Commit Message Optimization (CMO), an approach that starts from a human-written message (or from a blank one) and iteratively refines it using LLMs, guided by both automated context retrieval and retrieval-based similarity evaluators. Through qualitative analysis, it identifies seven software context themes that humans commonly use but that state-of-the-art CMG techniques such as OMG miss, and it shows that CMO significantly improves Rationality, Comprehensiveness, and Expressiveness over baselines while sometimes lagging in Conciseness. The findings indicate that CMO can augment existing CMG methods and can generate high-quality messages from scratch, which benefits software maintenance and collaboration. Future work aims to extend human-guided optimization to other software engineering tasks.

Abstract

Commit messages are crucial in software development, supporting maintenance tasks and communication among developers. While Large Language Models (LLMs) have advanced Commit Message Generation (CMG) using various software contexts, some of the contexts developers consider when writing high-quality commit messages are often missed by CMG techniques and cannot be retrieved easily, or at all, by automated tools. To address this, we propose Commit Message Optimization (CMO), which enhances human-written messages by leveraging LLMs and search-based optimization. CMO starts with human-written messages and iteratively improves them by integrating key contexts and feedback from external evaluators. Our extensive evaluation shows that CMO generates commit messages that are significantly more Rational, Comprehensive, and Expressive, outperforming state-of-the-art CMG methods and human messages 40.3% to 78.4% of the time. Moreover, CMO can support existing CMG techniques to further improve message quality, and it can generate high-quality messages when the human-written ones are left blank.

Paper Structure

This paper contains 30 sections, 1 equation, 3 figures, 6 tables, 1 algorithm.

Figures (3)

  • Figure 1: Example Commits
  • Figure 2: Overview of Retrieval-based Quality Evaluator
  • Figure 3: Example of Code Refactoring