Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R

Amirreza Esmaeili; Iman Saberi; Fatemeh H. Fard

TL;DR

The paper empirically compares three Parameter Efficient Fine-Tuning (PEFT) methods, LoRA, Compacter, and IA$^3$, on code-focused LLMs (CodeT5, CodeLlama) and general LLMs (T5, Llama2) for code summarization and code generation, including knowledge transfer to the unseen language R. It finds that LoRA consistently delivers the best performance across tasks and languages, that Compacter provides substantial resource savings with modest performance losses, and that IA$^3$ is generally outperformed by the other two. The study also shows that increasing the number of trainable parameters has a larger impact on functional accuracy than the choice of PEFT architecture, and it highlights how difficult it is to generate executable R code, especially with weaker base models. These findings guide the choice of PEFT method under computational constraints and underscore the potential for effective knowledge transfer to low-resource languages such as R. The work includes open-source scripts and qualitative analyses of code summarization quality, styling, and attention.

Abstract

Parameter Efficient Fine-Tuning (PEFT) methods have been proposed as an alternative to full fine-tuning of Large Language Models (LLMs) to reduce high training costs. While prior research demonstrates the effectiveness of PEFT methods for knowledge transfer with smaller language models, their application to larger LLMs, particularly for low-resource and unseen programming languages such as R, remains under-explored. In this work, we evaluate three PEFT methods (LoRA, Compacter, and IA$^3$) on LLMs for code summarization and generation, with a particular emphasis on knowledge transfer to R as an unseen, under-explored target language. Our experiments reveal that LoRA consistently outperforms Compacter and IA$^3$ in all settings, while Compacter offers significant resource efficiency with minimal performance trade-offs. Additionally, we find that the number of trainable parameters has a greater influence on the functional accuracy of the generated code than the PEFT architecture. Our study can direct future research on code intelligence tasks for unseen languages, including R, as well as the choice of PEFT methods for knowledge transfer, especially when balancing computational cost against performance.
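To make the notion of "number of trainable parameters" concrete, the following minimal sketch counts the parameters a LoRA adapter adds and expresses them as a percentage of the model's total parameters, which is how the figure captions define the parameter budget. The hidden size, number of adapted matrices, total parameter count, and ranks are hypothetical values chosen for illustration; they are not the paper's settings. Using the base model's parameter count as the denominator is also an assumption.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter on a (d_out x d_in) weight matrix.

    LoRA learns a low-rank update B @ A, where A has shape (rank, d_in)
    and B has shape (d_out, rank), so the count is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


def parameter_budget(trainable: int, total: int) -> float:
    """Trained parameters as a percentage of total parameters."""
    return 100.0 * trainable / total


# Hypothetical configuration (assumed for illustration, not from the paper).
HIDDEN_SIZE = 4096            # width of each adapted square projection
ADAPTED_MATRICES = 64         # e.g. two projections in each of 32 layers
TOTAL_PARAMS = 7_000_000_000  # assumed size of the base model

for rank in (4, 8, 16):
    per_matrix = lora_trainable_params(HIDDEN_SIZE, HIDDEN_SIZE, rank)
    trainable = ADAPTED_MATRICES * per_matrix
    budget = parameter_budget(trainable, TOTAL_PARAMS)
    print(f"rank={rank:>2}  trainable={trainable:>10,}  budget={budget:.4f}%")
```

Because the adapter size grows linearly with the rank, doubling the rank doubles the budget, which is one way to vary the number of trainable parameters while keeping the PEFT architecture fixed.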

Paper Structure (36 sections, 9 equations, 6 figures, 7 tables)

Introduction
Background
Pre-trained Language Models
Parameter Efficient Fine-Tuning
Study Design
Approach
Downstream Tasks
Datasets and Benchmarks
Evaluation Metrics
Base Models
PEFT Methods
Statistical Test
Experimental Setup
Results
Code Summarization
...and 21 more sections

Figures (6)

Figure 1: BLEU-4 scores of LoRA, Compacter and IA$^3$ per programming language for code summarization on T5 (top left), CodeT5 (top right), Llama2 (bottom left) and CodeLlama (bottom right) and their respective baselines.
Figure 2: Performance versus parameter budget of Compacter and LoRA for Python and R code generation and summarization tasks using the Llama family of models. BLEU-4 represents performance. The parameter budget is the percentage of trained parameters compared to the number of total parameters.
Figure 3: Performance versus parameter budget of Compacter and LoRA for Python and R code generation and summarization tasks using the T5 family of models. BLEU-4 represents performance. The parameter budget is the percentage of trained parameters compared to the number of total parameters.
Figure 4: Performance difference of LoRA and Compacter on CodeT5 and CodeLlama for Python code generation, with various parameter budgets. Performance is represented by Pass@1 (top) and BLEU-4 (bottom), with a left-pointing arrow and a red area indicating a decrease and a right-pointing arrow and a green area indicating an increase in performance compared to the baseline PEFT. The parameter budget on the y-axis is the ratio of trainable parameters compared to the number of the baseline's trainable parameters.
Figure 5: Performance difference of LoRA and Compacter on CodeT5 and CodeLlama for R code generation, with various parameter budgets. Performance is represented by Pass@1 (top) and BLEU-4 (bottom), with a left-pointing arrow and a red area indicating a decrease and a right-pointing arrow and a green area indicating an increase in performance compared to the baseline PEFT. The parameter budget on the y-axis is the ratio of trainable parameters compared to the number of the baseline's trainable parameters.
...and 1 more figure
