Optimization is Better than Generation: Optimizing Commit Message Leveraging Human-written Commit Message

Jiawei Li, David Faragó, Christian Petrov, Iftekhar Ahmed

TL;DR

This work addresses a gap in commit message generation (CMG): automated CMG methods miss contexts that human developers consider crucial. It introduces Commit Message Optimization (CMO), an approach that starts from a human-written commit message and iteratively refines it using LLMs guided by external evaluators and retrieved software contexts. By identifying eight themes of missing context (RQ1) and applying a context-aware optimization framework, CMO surpasses state-of-the-art CMG methods (OMG, CMC) and often outperforms human-written messages on three quality dimensions: Rationality, Comprehensiveness, and Expressiveness. The findings demonstrate the value of human-guided, context-driven optimization for software maintenance communication and suggest avenues for extending the approach to related SE tasks.

Abstract

Commit messages are crucial in software development, supporting maintenance tasks and communication among developers. While Large Language Models (LLMs) have advanced Commit Message Generation (CMG) using various software contexts, some contexts that developers consider are often missed by CMG techniques and cannot be retrieved easily, or at all, by automated tools. To address this, we propose Commit Message Optimization (CMO), which enhances human-written messages by leveraging LLMs and search-based optimization. CMO starts with human-written messages and iteratively improves them by integrating key contexts and feedback from external evaluators. Our extensive evaluation shows that CMO generates commit messages that are significantly more Rational, Comprehensive, and Expressive, while outperforming state-of-the-art CMG methods and human messages 88.2%-95.4% of the time.
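
The iterative refinement described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a simple greedy hill-climbing loop, and every name in it (`optimize_message`, `toy_refine`, the toy evaluator, and the example contexts) is hypothetical. In a real setting, `refine` would presumably wrap an LLM call and the evaluators would score dimensions such as Rationality, Comprehensiveness, and Expressiveness.

```python
from typing import Callable, Dict, List

# Hypothetical types: an evaluator maps a message to a score (higher is better),
# and a refiner proposes a new message from the current one, its scores, and contexts.
Evaluator = Callable[[str], float]
Refiner = Callable[[str, Dict[str, float], List[str]], str]


def optimize_message(
    human_message: str,
    refine: Refiner,
    evaluators: Dict[str, Evaluator],
    contexts: List[str],
    max_iters: int = 5,
) -> str:
    """Greedy search: keep a candidate only if its total score improves."""

    def score(message: str) -> Dict[str, float]:
        return {name: evaluate(message) for name, evaluate in evaluators.items()}

    best = human_message  # the search starts from the human-written message
    best_scores = score(best)
    for _ in range(max_iters):
        candidate = refine(best, best_scores, contexts)
        candidate_scores = score(candidate)
        if sum(candidate_scores.values()) > sum(best_scores.values()):
            best, best_scores = candidate, candidate_scores
        else:
            break  # no improvement: stop early
    return best


# Toy stand-ins (assumptions for illustration only, not real LLM or evaluator calls).
CONTEXTS = ["Closes issue #12.", "Affects parser module."]


def toy_refine(message: str, scores: Dict[str, float], contexts: List[str]) -> str:
    # Append the first context that the message does not mention yet.
    for context in contexts:
        if context not in message:
            return f"{message} {context}"
    return message


toy_evaluators: Dict[str, Evaluator] = {
    # Counts how many of the known contexts appear in the message.
    "comprehensiveness": lambda message: float(sum(c in message for c in CONTEXTS)),
}

result = optimize_message("Fix null check.", toy_refine, toy_evaluators, CONTEXTS)
print(result)
```

Starting from the human-written message rather than an empty prompt means the search can only keep or improve on what the developer already wrote, which is the design choice the abstract emphasizes.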

Paper Structure (26 sections, 2 equations, 3 figures, 6 tables, 1 algorithm)