An Empirical Study on JIT Defect Prediction Based on BERT-style Model
Yuxiang Guo, Xiaopeng Gao, Bo Jiang
TL;DR
The paper addresses the problem of how fine-tuning configurations of BERT-style models affect JIT defect prediction. It adopts CodeBERT and RoBERTaJIT as backbone models and conducts a systematic empirical study across six projects with a large commit dataset, examining parameter freezing, initialization, feature extractors, and optimizer strategies. Key contributions include identifying the first encoder layer as crucial, showing initialization and weight decay can modestly improve performance, demonstrating that simple FCN extractors can be highly effective, and proposing a LoRA-based, memory-efficient fine-tuning method (OCJITLoRA) with competitive accuracy. The work provides practical guidance for cost-effective deployment of BERT-style models in JIT defect prediction and demonstrates substantial memory savings compared to full fine-tuning.
Abstract
Previous work on Just-In-Time (JIT) defect prediction has primarily applied pre-trained models directly, neglecting the configuration of the fine-tuning process. In this study, we conduct a systematic empirical study of how fine-tuning settings affect BERT-style pre-trained models for JIT defect prediction. Specifically, we explore the impact of different parameter freezing settings, parameter initialization settings, and optimizer strategies on model performance. Our findings reveal the crucial role of the first encoder layer in the BERT-style model and the project-specific sensitivity to parameter initialization settings. Another notable finding is that adding a weight decay strategy to the Adam optimizer can slightly improve model performance. Additionally, we compare performance across different feature extractors (FCN, CNN, LSTM, Transformer) and find that a simple network can achieve strong performance. These results offer new insights for fine-tuning pre-trained models for JIT defect prediction. We combine these findings into a cost-effective fine-tuning method based on LoRA, which achieves comparable performance with only one-third of the memory consumption of the original fine-tuning process.
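To give a rough sense of why a LoRA-based method can be cheaper than full fine-tuning, the following sketch counts trainable parameters when a low-rank update B·A (B of shape d_out × r, A of shape r × d_in) is trained in place of a full d_out × d_in weight matrix. All concrete values are assumptions chosen for illustration and are not taken from this study: a hidden size of 768 and 12 encoder layers (typical of a base-size BERT-style encoder), two adapted matrices per layer, and rank 8. The function names are hypothetical. Note that a parameter count is only a proxy; the one-third memory figure reported above is a measured result and also depends on activations, optimizer state, and batch size.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters of one LoRA adapter: B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the full weight matrix is updated."""
    return d_out * d_in


# Illustrative assumptions only (not the configuration used in the paper).
HIDDEN = 768            # hidden size of a base-size BERT-style encoder
NUM_LAYERS = 12         # number of encoder layers
ADAPTED_PER_LAYER = 2   # e.g. two attention projection matrices per layer
RANK = 8                # LoRA rank r

num_matrices = NUM_LAYERS * ADAPTED_PER_LAYER
lora_total = num_matrices * lora_trainable_params(HIDDEN, HIDDEN, RANK)
full_total = num_matrices * full_params(HIDDEN, HIDDEN)
ratio = lora_total / full_total

print(f"LoRA trainable parameters: {lora_total:,}")
print(f"Full fine-tuning of the same matrices: {full_total:,}")
print(f"Ratio: {ratio:.4f}")
```

Under these assumed values, the adapters train 294,912 parameters against 14,155,776 for the same matrices under full fine-tuning, a ratio of 1/48. Changing the rank or the set of adapted matrices shifts this ratio linearly.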
