Adaptive Principal Components Allocation with the $\ell_{2,g}$-regularized Gaussian Graphical Model for Efficient Fine-Tuning Large Models

Jingjing Zheng; Yankai Cao

TL;DR

This paper tackles parameter-efficient fine-tuning (PEFT) of very large models by modeling inter-parameter interactions with a Gaussian Graphical Model (GGM). It introduces an $\ell_{2,g}$-regularized GGM and an SVD-based node construction that selects which principal components to train in each layer, capturing global dependencies that local low-rank methods miss. A Block Coordinate Descent (BCD) framework solves the non-convex $\ell_{2,g}$ objective via a coupled $(\Omega,\Delta)$ formulation, enabling node selection through structural sparsity and important-node metrics. Empirical results on the GLUE benchmark with RoBERTa-Base show competitive performance with significantly fewer trainable parameters, and ablations show the value of the important-nodes mechanism. Overall, the work advances PEFT by combining global dependency modeling with non-convex sparsity to make fine-tuning both efficient and effective.
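
For orientation, sparse GGM estimation is commonly written as a penalized maximum-likelihood problem over the precision matrix $\Omega$. The form below is the generic textbook template, not the paper's exact objective: $S$ (a sample covariance of the node features), $\lambda > 0$, and the placeholder penalty $P$ are assumptions introduced here, with $P$ standing in for the $\ell_{2,g}$ regularizer, whose definition and coupled $(\Omega,\Delta)$ reformulation are not given in this summary.

$$
\min_{\Omega \succ 0} \; -\log\det\Omega + \operatorname{tr}(S\Omega) + \lambda\, P(\Omega)
$$

In a Gaussian model, $\Omega_{ij} = 0$ means nodes $i$ and $j$ are conditionally independent given all other nodes, which is why a sparsity-inducing penalty on $\Omega$ can be read as selecting graph structure.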

Abstract

In this work, we propose a novel Parameter-Efficient Fine-Tuning (PEFT) approach based on Gaussian Graphical Models (GGMs), marking the first application of GGMs to PEFT tasks, to the best of our knowledge. The proposed method utilizes the $\ell_{2,g}$-norm to effectively select critical parameters and capture global dependencies. The resulting non-convex optimization problem is efficiently solved using a Block Coordinate Descent (BCD) algorithm. Experimental results on the GLUE benchmark [24] for fine-tuning RoBERTa-Base [18] demonstrate the effectiveness of the proposed approach, achieving competitive performance with significantly fewer trainable parameters. The code for this work is available at: https://github.com/jzheng20/Course projects.git.
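
To make the idea of training only selected principal components concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes each SVD component of a weight matrix acts as one candidate unit, and that some selection step (in the paper, the GGM-based node selection) has already produced the indices to train. The function name `split_principal_components` and the `selected` argument are hypothetical.

```python
import numpy as np


def split_principal_components(weight, selected):
    """Split W = U diag(s) V^T into a trainable part and a frozen remainder.

    `selected` holds the indices of the singular components to fine-tune.
    How those indices are chosen is outside this sketch (assumption).
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    mask = np.zeros(s.shape[0], dtype=bool)
    mask[list(selected)] = True
    # Scale the chosen columns of U by their singular values, then project back.
    trainable = (u[:, mask] * s[mask]) @ vt[mask, :]
    frozen = (u[:, ~mask] * s[~mask]) @ vt[~mask, :]
    return trainable, frozen


rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))
trainable, frozen = split_principal_components(W, selected=[0, 2])
```

The two parts sum back to the original matrix, so updating only `trainable` touches a low-rank slice of the layer while `frozen` stays fixed.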
