Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R
Amirreza Esmaeili, Iman Saberi, Fatemeh H. Fard
TL;DR
The paper empirically compares three Parameter Efficient Fine-Tuning (PEFT) methods, LoRA, Compacter, and IA$^3$, on code-focused LLMs (CodeT5, CodeLlama) and general-purpose LLMs (T5, Llama2) for code summarization and code generation, including knowledge transfer to the unseen language R. It finds that LoRA consistently delivers the best performance across tasks and languages, that Compacter provides substantial resource savings with modest performance losses, and that IA$^3$ is generally outperformed by the other two. The study also shows that the number of trainable parameters affects functional accuracy more than the choice of PEFT architecture does, and it highlights how difficult it is to produce executable R code, especially with weaker base models. These insights guide the choice of PEFT method under computational constraints and underscore the potential for effective knowledge transfer to low-resource languages such as R. The work includes open-source scripts and deeper qualitative analyses of code summarization quality, styling, and attention, contributing to both the methodology and the applied understanding of LLMs in software engineering.
Abstract
Parameter Efficient Fine-Tuning (PEFT) methods have been proposed as an alternative to full fine-tuning of Large Language Models (LLMs) in order to reduce high training costs. While prior research demonstrates the effectiveness of PEFT methods in knowledge transfer using smaller language models, their application to larger LLMs, particularly for low-resource and unseen programming languages such as R, remains under-explored. In this work, we evaluate the PEFT methods LoRA, Compacter, and IA$^3$ on LLMs for code summarization and generation, with a particular emphasis on knowledge transfer to R as an unseen, under-explored target language. Our experiments reveal that LoRA consistently outperforms Compacter and IA$^3$ in all settings, while Compacter offers significant resource efficiency with minimal performance trade-offs. Additionally, we find that the number of trainable parameters has a greater influence on the functional accuracy of the generated code than the PEFT architecture does. Our study can direct future research on developing code intelligence tasks for unseen languages, including R, and on choosing PEFT methods for knowledge transfer, especially when balancing computational cost against performance.
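To make the notion of "number of trainable parameters" concrete, the following minimal sketch counts the parameters a LoRA adapter adds to a single weight matrix. It relies only on the standard LoRA formulation, in which a frozen weight of shape d_out x d_in is augmented by a low-rank product B·A with A of shape r x d_in and B of shape d_out x r. The layer size (4096) and the ranks shown are illustrative assumptions and are not the configurations used in the paper.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight that the adapter augments."""
    return d_in * d_out


if __name__ == "__main__":
    d = 4096  # hypothetical hidden size, chosen only for illustration
    for rank in (4, 8, 16):  # hypothetical ranks
        added = lora_trainable_params(d, d, rank)
        share = added / full_params(d, d)
        print(f"rank={rank}: {added} trainable parameters ({share:.4%} of the dense weight)")
```

Raising the rank increases the trainable parameter count linearly, which is one way the parameter budget can be varied independently of the PEFT architecture.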
