TLoRA: Tri-Matrix Low-Rank Adaptation of Large Language Models

Tanvir Islam

TL;DR

The paper addresses the high resource cost of fine-tuning large language models by introducing TLoRA, a tri-matrix low-rank adaptation that fixes two random matrices and trains a middle matrix with a learnable scaling factor. This design yields substantial parameter savings over full fine-tuning and LoRA while maintaining competitive performance on GLUE tasks with RoBERTa-large. Empirical results show strong task accuracy and high parameter efficiency, along with insight into adaptation dynamics: the trainable matrix B learns Gaussian-like updates, scaling factors vary by layer, and TLoRA's updates align closely with LoRA's in direction and magnitude. The work demonstrates a practical, scalable approach to resource-efficient LLM adaptation suitable for constrained hardware and large-scale deployment.

Abstract

We propose TLoRA, a novel tri-matrix low-rank adaptation method that decomposes weight updates into three matrices: two fixed random matrices and one trainable matrix, combined with a learnable, layer-wise scaling factor. This tri-matrix design enables TLoRA to achieve highly efficient parameter adaptation while introducing minimal additional computational overhead. Through extensive experiments on the GLUE benchmark, we demonstrate that TLoRA achieves comparable performance to existing low-rank methods such as LoRA and Adapter-based techniques, while requiring significantly fewer trainable parameters. Analyzing the adaptation dynamics, we observe that TLoRA exhibits Gaussian-like weight distributions, stable parameter norms, and scaling factor variability across layers, further highlighting its expressive power and adaptability. Additionally, we show that TLoRA closely resembles LoRA in its eigenvalue distributions, parameter norms, and cosine similarity of updates, underscoring its ability to effectively approximate LoRA's adaptation behavior. Our results establish TLoRA as a highly efficient and effective fine-tuning method for LLMs, offering a significant step forward in resource-efficient model adaptation.
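
The tri-matrix update described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. The class name `TLoRALinear`, the matrix shapes (A as d_in x r, B as r x r, C as r x d_out), the Gaussian initialization scales, and the initial value of the scaling factor are all assumptions made for the example. The zero initialization of B follows the Figure 4 caption.

```python
import numpy as np


class TLoRALinear:
    """Frozen linear map plus a tri-matrix low-rank update (illustrative sketch)."""

    def __init__(self, weight, rank, alpha_init=1.0, seed=0):
        rng = np.random.default_rng(seed)
        d_in, d_out = weight.shape
        self.weight = weight  # frozen pretrained weight, shape (d_in, d_out)
        # Fixed random projections; the scaling of the Gaussians is an assumption.
        self.A = rng.normal(0.0, 1.0 / np.sqrt(d_in), size=(d_in, rank))
        self.C = rng.normal(0.0, 1.0 / np.sqrt(rank), size=(rank, d_out))
        # Trainable middle matrix, zero-initialized so the update starts at zero.
        self.B = np.zeros((rank, rank))
        # Trainable scalar scaling factor, one per adapted layer.
        self.alpha = alpha_init

    def forward(self, x):
        # x passes through A, then B, then C; the result is scaled by alpha
        # and added to the output of the frozen layer.
        update = ((x @ self.A) @ self.B) @ self.C
        return x @ self.weight + self.alpha * update

    def num_trainable(self):
        # Only B and the scalar alpha would receive gradients.
        return self.B.size + 1


# Usage: a hypothetical 8 -> 8 layer adapted at rank 2.
layer = TLoRALinear(np.eye(8), rank=2)
x = np.ones((3, 8))
y = layer.forward(x)  # equals x @ weight while B is still zero
```

Because B starts at zero, the adapted layer initially reproduces the frozen layer exactly, and training only has to move the r x r entries of B and the single scalar.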

Paper Structure

This paper contains 14 sections, 10 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Schematic representation of TLoRA. The weight update is decomposed into a tri-matrix structure consisting of two fixed random matrices $A$ and $C$, and a trainable matrix $B$. The input $x$ is projected through the sequence of matrices $A$, $B$, $C$, followed by a learnable scaling factor to control the magnitude of the adaptation. This tri-matrix design enables efficient low-rank adaptation while minimizing trainable parameters.
  • Figure 2: Comparison of trainable parameters across different fine-tuning methods. The bar chart illustrates the parameter count for full fine-tuning (Full FT), LoRA (rank $r = 32$), and TLoRA (rank $r = 32$). TLoRA significantly reduces the number of trainable parameters compared to LoRA, showcasing its parameter efficiency while maintaining competitive performance.
  • Figure 3: Training and validation loss curves for the MRPC dataset. The figure demonstrates the stability of TLoRA during training over 30 epochs.
  • Figure 4: Weight distribution histograms for the original weight matrix and the TLoRA matrices $A$, $B$, and $C$. The matrices $A$ and $C$ are randomly initialized and remain fixed, following a Gaussian distribution. In contrast, the trainable matrix $B$, which is initialized to zero, evolves during training and adopts a Gaussian-like distribution, highlighting the effectiveness of TLoRA's tri-matrix decomposition.
  • Figure 5: Evolution of L2 norms for the TLoRA $B$ matrix over training epochs.
  • ...and 5 more figures
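
The parameter comparison in Figure 2 can be made concrete with a per-matrix count. The sketch below is only illustrative arithmetic. It assumes a single square weight matrix of size 1024 x 1024 (the hidden size of RoBERTa-large), the standard LoRA count of r(d + k), and a TLoRA count of r^2 + 1 (an r x r matrix $B$ plus one scaling factor). Which matrices are actually adapted, and therefore the model-wide totals, is not stated in this section. The helper name `trainable_params` is hypothetical.

```python
def trainable_params(d, k, r):
    """Trainable parameters for adapting one d x k weight matrix at rank r."""
    return {
        "full_ft": d * k,      # every entry of the weight is trained
        "lora": r * (d + k),   # two low-rank factors: d x r and r x k
        "tlora": r * r + 1,    # assumed r x r middle matrix plus one scalar
    }


counts = trainable_params(d=1024, k=1024, r=32)
print(counts)  # {'full_ft': 1048576, 'lora': 65536, 'tlora': 1025}
```

Under these assumptions the TLoRA count does not depend on the layer dimensions $d$ and $k$ at all, only on the rank, which is where the savings over LoRA come from.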