Go Static: Contextualized Logging Statement Generation

Yichen Li, Yintong Huo, Renyi Zhong, Zhihan Jiang, Jinyang Liu, Junjie Huang, Jiazhen Gu, Pinjia He, Michael R. Lyu

TL;DR

SCLogger addresses the limitations of single-method context in automated logging statement generation by leveraging inter-method static contexts. It introduces a four-phase pipeline (static scope extension, logging style adaptation, contextualized prompt construction, and logging variable refinement) that uses large language models to choose accurate logging locations and produce high-quality statements. By incorporating code and log slices, variable sets, and project-specific style examples, SCLogger outperforms state-of-the-art baselines across multiple metrics on ten open-source Java projects and generalizes across backbone models including GPT-4, GPT-3.5, and Llama-2-70b. The approach also offers practical benefits in cost and IDE integration, may apply to other programming languages, and leaves room for further refinement.

Abstract

Logging practices have been extensively investigated to assist developers in writing appropriate logging statements for documenting software behaviors. Although numerous automatic logging approaches have been proposed, their performance remains unsatisfactory because they are constrained to single-method input and lack informative programming context from outside the method. Specifically, we identify three inherent limitations of single-method context: limited static scope of logging statements, inconsistent logging styles, and missing type information of logging variables. To tackle these limitations, we propose SCLogger, the first contextualized logging statement generation approach with inter-method static contexts. First, SCLogger extracts inter-method contexts with static analysis to construct the contextualized prompt for language models to generate a tentative logging statement. The contextualized prompt consists of an extended static scope and sampled similar methods, ordered by the chain-of-thought (CoT) strategy. Second, SCLogger refines how logging variables are accessed by formulating a new refinement prompt for language models, which incorporates detailed type information of the variables in the tentative logging statement. The evaluation results show that SCLogger surpasses the state-of-the-art approach by 8.7% in logging position accuracy, 32.1% in level accuracy, 19.6% in variable precision, and 138.4% in text BLEU-4 score. Furthermore, SCLogger consistently boosts the performance of logging statement generation across a range of large language models, thereby showcasing the generalizability of this approach.
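
The two prompting stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `StaticContext`, `build_contextualized_prompt`, and `build_refinement_prompt`, the section headings, and the ordering of prompt parts are all assumptions made for illustration, and the static analysis and the language model call are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StaticContext:
    """Hypothetical container for inter-method context found by static analysis."""
    target_method: str                                          # method that needs a logging statement
    scope_snippets: List[str] = field(default_factory=list)     # extended static scope (e.g. callers)
    similar_methods: List[str] = field(default_factory=list)    # sampled methods showing project log style


def build_contextualized_prompt(ctx: StaticContext) -> str:
    """Stage 1 (assumed layout): scope, then style examples, then target, then task."""
    parts = ["### Extended static scope"]
    parts += ctx.scope_snippets or ["(none found)"]
    parts.append("### Similar methods with logging statements")
    parts += ctx.similar_methods or ["(none found)"]
    parts.append("### Target method")
    parts.append(ctx.target_method)
    parts.append("### Task")
    parts.append("Reason step by step, then propose one logging statement and its position.")
    return "\n".join(parts)


def build_refinement_prompt(tentative_statement: str, variable_types: Dict[str, str]) -> str:
    """Stage 2 (assumed layout): tentative statement plus type details of its variables."""
    type_lines = [f"- {name}: {type_info}" for name, type_info in sorted(variable_types.items())]
    return "\n".join(
        [
            "### Tentative logging statement",
            tentative_statement,
            "### Type information of logging variables",
            *type_lines,
            "### Task",
            "Refine how each variable is accessed so the logged values are informative.",
        ]
    )
```

In this sketch, the output of the first prompt (a tentative statement) would be passed to the second prompt together with type details for the variables it mentions; how those details are obtained and formatted is an assumption here, not something the abstract specifies.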

Paper Structure

This paper contains 34 sections, 1 equation, 8 figures, and 4 tables.

Figures (8)

  • Figure 1: Motivating example 1. The original logging statement is highlighted in the green area while the invocation points are highlighted in the orange area.
  • Figure 2: Motivating example 2. The original logging statement is highlighted in the green area while the logging statements in the similar methods are highlighted in the orange area.
  • Figure 3: Motivating example 3. The original logging statement is highlighted in the green area while the corresponding logging variable is highlighted in the orange area.
  • Figure 4: Overview of the SCLogger workflow.
  • Figure 5: The log slice and code slice example of SCLogger. The target method is highlighted in the red area.
  • ...and 3 more figures