Are your comments outdated? Towards automatically detecting code-comment consistency

Yuan Huang, Yinan Chen, Xiangping Chen, Xiaocong Zhou

TL;DR

This work proposes CoCC, a learning-based method that detects whether code and its comment are consistent, and shows that CoCC can effectively detect outdated comments with precision over 90%.

Abstract

In software development and maintenance, code comments help developers understand source code and improve communication among developers. However, developers sometimes neglect to update the corresponding comment when changing the code, resulting in outdated comments (i.e., inconsistent code and comments). Outdated comments are harmful and may mislead subsequent developers. More seriously, they may lead to a fatal flaw in the future. To automatically identify outdated comments in source code, we propose a learning-based method, called CoCC, to detect the consistency between code and comments. To identify outdated comments efficiently, we extract multiple features from both code and comments before and after they change. We also consider the relation between code and comment in our model. Experimental results show that CoCC can effectively detect outdated comments with precision over 90%. In addition, we identify the 15 most important factors that cause outdated comments and verify the applicability of CoCC to different programming languages. We also used CoCC to find outdated comments in the latest commits of open source projects, which further demonstrates the effectiveness of the proposed method.

Paper Structure (21 sections, 17 equations, 13 figures, 19 tables, 2 algorithms)

Figures (13)

  • Figure 1: Comment contains a description of the code that was removed in jEdit commit #13416.
  • Figure 2: Comment that lacks a description of the new code in JAMWiki commit #304.
  • Figure 3: Comment that refers to variables that do not exist in EJBCA commit #4977.
  • Figure 4: Method-type comment and block-type comment.
  • Figure 5: Two block-type code-comment pairs extracted from Figure 4.
  • ...and 8 more figures