FastLog: An End-to-End Method to Efficiently Generate and Insert Logging Statements
Xiaoyuan Xie, Zhipeng Cai, Songqiang Chen, Jifeng Xuan
TL;DR
FastLog tackles the problem of efficiently generating and inserting logging statements without altering non-log code. It uses a two-stage approach, (i) token-level insertion-point prediction and (ii) complete logging-statement generation, implemented with two fine-tuned PLBART models and text splitting to handle long inputs. Compared with the state-of-the-art LANCE, FastLog is about 12x faster per sample and improves output quality in terms of insertion accuracy and log-message BLEU/ROUGE scores, while reducing the risk of unwanted code modifications. The method is validated on both the original and newly constructed test sets, demonstrating robustness to long inputs and suitability for interactive, just-in-time use, with potential extensions to multiple predictions and other languages.
Abstract
Logs play a crucial role in modern software systems, serving as a means for developers to record essential information for future software maintenance. Because the performance of log-based maintenance tasks relies heavily on the quality of logging statements, various approaches have been proposed to assist developers in writing appropriate logging statements. However, these approaches either support only some sub-tasks of the overall activity, or incur a relatively high time cost and may introduce unwanted modifications. To address these limitations, we propose FastLog, which supports the complete logging statement generation and insertion activity at high speed. Specifically, given a program method, FastLog first predicts the insertion position at the finest, token level, and then generates a complete logging statement to insert. We further apply text splitting to long inputs to improve the accuracy of predicting where to insert logging statements. A comprehensive empirical analysis shows that our method outperforms the state-of-the-art approach in both efficiency and output quality, which reveals its great potential and practicality for current real-time intelligent development environments.
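The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two callables stand in for the fine-tuned models, and the names `split_tokens`, `predict_global_position`, and `insert_logging_statement`, the per-chunk confidence score, and the "keep the highest-scoring chunk" rule are all assumptions made for illustration. The sketch only shows how a token-level insertion leaves every non-log token untouched and how splitting lets a length-limited predictor cover a long method.

```python
from typing import Callable, List, Optional, Tuple

Tokens = List[str]
# Hypothetical stand-ins for the two fine-tuned models (not the authors' API).
# The predictor returns (index of the token to insert after, confidence) or None.
PositionPredictor = Callable[[Tokens], Optional[Tuple[int, float]]]
# The generator returns the tokens of one complete logging statement.
StatementGenerator = Callable[[Tokens, int], Tokens]


def split_tokens(tokens: Tokens, max_len: int) -> List[Tuple[int, Tokens]]:
    """Split tokens into consecutive chunks, keeping each chunk's start offset."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [(i, tokens[i:i + max_len]) for i in range(0, len(tokens), max_len)]


def predict_global_position(
    tokens: Tokens, predictor: PositionPredictor, max_len: int
) -> Optional[int]:
    """Run the predictor on every chunk and keep the most confident position.

    Assumption: the chunk with the highest score wins; the source does not
    specify how per-chunk predictions are combined.
    """
    best: Optional[Tuple[int, float]] = None
    for offset, chunk in split_tokens(tokens, max_len):
        hit = predictor(chunk)
        if hit is None:
            continue
        local_index, score = hit
        if best is None or score > best[1]:
            # Map the chunk-local index back to an index in the full method.
            best = (offset + local_index, score)
    return None if best is None else best[0]


def insert_logging_statement(
    tokens: Tokens,
    predictor: PositionPredictor,
    generator: StatementGenerator,
    max_len: int = 512,
) -> Tokens:
    """Stage 1: choose an insertion position. Stage 2: generate and splice in."""
    position = predict_global_position(tokens, predictor, max_len)
    if position is None:
        return list(tokens)  # no position predicted: return the method unchanged
    statement = generator(tokens, position)
    # Pure insertion: the original tokens are copied verbatim around the log.
    return tokens[:position + 1] + statement + tokens[position + 1:]
```

Because the output is built by slicing the original token list around the generated statement, removing the inserted tokens always recovers the input exactly, which is the property the abstract refers to as avoiding unwanted modifications.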
