An Empirical Study on JIT Defect Prediction Based on BERT-style Model

Yuxiang Guo, Xiaopeng Gao, Bo Jiang

TL;DR

The paper addresses the problem of how fine-tuning configurations of BERT-style models affect JIT defect prediction. It adopts CodeBERT and RoBERTaJIT as backbone models and conducts a systematic empirical study across six projects with a large commit dataset, examining parameter freezing, initialization, feature extractors, and optimizer strategies. Key contributions include identifying the first encoder layer as crucial, showing initialization and weight decay can modestly improve performance, demonstrating that simple FCN extractors can be highly effective, and proposing a LoRA-based, memory-efficient fine-tuning method (OCJITLoRA) with competitive accuracy. The work provides practical guidance for cost-effective deployment of BERT-style models in JIT defect prediction and demonstrates substantial memory savings compared to full fine-tuning.

Abstract

Previous work on Just-In-Time (JIT) defect prediction has primarily applied pre-trained models directly, neglecting the configuration of the fine-tuning process. In this study, we perform a systematic empirical study to understand how the settings of the fine-tuning process affect BERT-style pre-trained models for JIT defect prediction. Specifically, we explore the impact of different parameter freezing settings, parameter initialization settings, and optimizer strategies on model performance. Our findings reveal the crucial role of the first encoder layer in the BERT-style model and the project-specific sensitivity to parameter initialization settings. Another notable finding is that adding a weight decay strategy to the Adam optimizer can slightly improve model performance. Additionally, we compare different feature extractors (FCN, CNN, LSTM, transformer) and find that a simple network can achieve strong performance. These results offer new insights for fine-tuning pre-trained models for JIT defect prediction. We combine these findings into a cost-effective fine-tuning method based on LoRA, which achieves comparable performance with only one-third of the memory consumption of the original fine-tuning process.
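
To give a rough sense of why a LoRA-based method can be cheaper than full fine-tuning, the sketch below counts trainable parameters for a single weight matrix. It assumes the standard LoRA formulation, in which a frozen weight of shape d_out x d_in is augmented by a trainable low-rank product B·A with B of shape d_out x r and A of shape r x d_in. The hidden size of 768 matches RoBERTa-base-sized encoders such as CodeBERT; the rank r = 8 and the function names are illustrative assumptions, not values taken from the paper. The count covers one matrix only and is not the paper's memory measurement, which also depends on activations, optimizer state, and the rest of the model.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update B @ A on a frozen matrix.

    B has shape (d_out, rank) and A has shape (rank, d_in), so only
    rank * (d_out + d_in) values are trained.
    """
    return rank * (d_out + d_in)


hidden = 768  # hidden size of a RoBERTa-base-sized encoder
rank = 8      # illustrative rank (assumption, not from the paper)

full = full_finetune_params(hidden, hidden)
lora = lora_params(hidden, hidden, rank)

print(f"full fine-tuning: {full} trainable parameters")
print(f"LoRA (r={rank}): {lora} trainable parameters")
print(f"ratio: {lora / full:.4f}")
```

Under these assumptions, the low-rank update trains 12,288 values against 589,824 for the full matrix, about 2% of the original count.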

Paper Structure (13 sections, 1 equation, 3 figures, 9 tables)

This paper contains 13 sections, 1 equation, 3 figures, 9 tables.

Figures (3)

  • Figure 1: Various methods adjustable in the fine-tuning process of the BERT-style JIT defect prediction models.
  • Figure 2: AUC change with different freezing settings in RoBERTaJIT and CodeBERTJIT.
  • Figure 3: AUC and F1 score change on different parameter reinitialization settings for RoBERTaJIT and CodeBERTJIT.