Deep Generative Continual Learning using Functional LoRA: FunLoRA

Victor Enescu, Hichem Sahbi

TL;DR

This work tackles catastrophic forgetting in continual learning for deep generative models by introducing FunLoRA, a parameter-efficient conditioning mechanism. FunLoRA uses rank-1 LoRA updates whose effective rank is increased via functional transforms (such as cosine, rshift, and powers) and applies them to convolutional layers within a conditional flow-matching backbone. The approach trains only current-task parameters and uses a layer-importance criterion to select a small subset of the U-Net backbone layers for adaptation, avoiding retraining on past data and reducing memory and compute. The paper formalizes conditional flow matching with the loss $L_{CFM}$ and defines FunLoRA as $\mathbf{F}_{\mathbf{y}} = \frac{1}{p} \sum_{i=1}^{p} \alpha_i f_i(\mathbf{A}_{\mathbf{y}}, \mathbf{B}_{\mathbf{y}})$, enabling expressive class conditioning while remaining parameter-efficient. Empirical results on CIFAR-100 and ImageNet-100 show that FunLoRA outperforms diffusion-based incremental methods and even some pre-trained Stable Diffusion baselines, with substantially fewer extra parameters and faster sampling; COCO pretraining further amplifies the gains.
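To make the FunLoRA definition above concrete, here is a minimal NumPy sketch of $\mathbf{F}_{\mathbf{y}} = \frac{1}{p} \sum_{i=1}^{p} \alpha_i f_i(\mathbf{A}_{\mathbf{y}}, \mathbf{B}_{\mathbf{y}})$ that checks how elementwise transforms raise the rank of a rank-1 product. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`funlora_matrix`, `numerical_rank`), the fixed frequencies in `omegas`, the uniform weights, and the tolerance are all hypothetical, and in the paper the frequencies and weights are learned.

```python
import numpy as np

def funlora_matrix(A, B, funcs, alphas):
    """F = (1/p) * sum_i alpha_i * f_i(A, B), with p = len(funcs)."""
    p = len(funcs)
    return sum(a * f(A, B) for a, f in zip(alphas, funcs)) / p

def numerical_rank(M, rel_tol=1e-8):
    """Count singular values above rel_tol times the largest one (assumed tolerance)."""
    s = np.linalg.svd(M, compute_uv=False)
    return int((s > rel_tol * s[0]).sum())

rng = np.random.default_rng(0)
C_out, C_in = 16, 12
A = rng.standard_normal((C_out, 1))   # rank-1 factor, shape (C_out, 1)
B = rng.standard_normal((1, C_in))    # rank-1 factor, shape (1, C_in)

# Plain LoRA product: rank 1 by construction.
F_plain = A @ B

# Assumed cosine family: cos(omega_i * (A @ B)) applied entrywise.
omegas = [0.5, 1.0, 1.5, 2.0]         # hypothetical fixed frequencies
cos_funcs = [lambda A, B, w=w: np.cos(w * (A @ B)) for w in omegas]
F_cos = funlora_matrix(A, B, cos_funcs, np.ones(len(omegas)))

# Assumed power family: entrywise powers 1..4 of the product.
# Each term equals (A**k) @ (B**k), so the sum has rank at most 4.
pow_funcs = [lambda A, B, k=k: (A @ B) ** k for k in range(1, 5)]
F_pow = funlora_matrix(A, B, pow_funcs, np.ones(4))

rank_plain = numerical_rank(F_plain)
rank_cos = numerical_rank(F_cos)
rank_pow = numerical_rank(F_pow)
print(rank_plain, rank_cos, rank_pow)
```

The power family is bounded by the number of distinct powers because an entrywise power of a rank-1 matrix is itself rank 1, whereas the cosine is not a finite sum of such terms, which is why it can reach a higher rank in this toy setting.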

Abstract

Continual adaptation of deep generative models holds tremendous potential and critical importance, given their rapid and expanding usage in text- and vision-based applications. Incremental training, however, remains highly challenging due to the catastrophic forgetting phenomenon, which makes it difficult for neural networks to effectively incorporate new knowledge. A common strategy consists of retraining the generative model on its own synthetic data in order to mitigate forgetting. Yet, such an approach faces two major limitations: (i) the continually increasing training time eventually becomes intractable, and (ii) reliance on synthetic data inevitably leads to long-term performance degradation, since synthetic samples lack the richness of real training data. In this paper, we attenuate these issues by designing a novel and more expressive conditioning mechanism for generative models based on low-rank adaptation (LoRA). It exclusively employs rank-1 matrices whose reparametrized matrix rank is functionally increased using carefully selected functions; we dub it functional LoRA (FunLoRA). Using this dynamic conditioning, the generative model is guaranteed to avoid catastrophic forgetting and needs only to be trained on data from the current task. Extensive experiments using flow-matching-based models trained from scratch show that our proposed parameter-efficient fine-tuning (PEFT) method surpasses prior state-of-the-art results based on diffusion models, reaching higher classification accuracy scores while requiring only a fraction of the memory cost and sampling time.

Paper Structure

This paper contains 22 sections, 23 equations, 8 figures, 12 tables, 1 algorithm.

Figures (8)

  • Figure 1: Visualization of the proposed LoRA reparametrization applied to rescale all the filters in a convolutional layer. Given the convolutional weights ${\mathbf{W}}_{0} \in \mathbb{R}^{C_{out} \times C_{in} \times {s} \times {s}}$, we learn two low rank matrices ${\mathbf{A}}_{{\mathbf{y}}} \in \mathbb{R}^{C_{out} \times 1}$ and ${\mathbf{B}}_{{\mathbf{y}}} \in \mathbb{R}^{1 \times C_{in}}$ for each class so that their product gives a matrix ${\mathbf{F}}_{{\mathbf{y}}} \in \mathbb{R}^{C_{out} \times C_{in}}$ which spans all filters, and can be used to modify their values. To further increase the rank of this matrix ${\mathbf{F}}_{{\mathbf{y}}}$ (which is one by default), functions $\{f_i\}_i$ are applied to each entry in ${\mathbf{F}}_{{\mathbf{y}}}$ or to ${\mathbf{A}}_{{\mathbf{y}}}$ and ${\mathbf{B}}_{{\mathbf{y}}}$ separately, before taking the product.
  • Figure 2: Illustration of the U-Net whose convolutional layers, indexed by $l$, are adapted using the class-conditional functional matrix $\mathbf{F}^{l}_{{\mathbf{y}}}$. In this figure only, layer indices ($l$) are written as superscripts to avoid cluttered notation.
  • Figure 3: Average importance $I_{l}$ (as defined in the layer-importance equation) calculated for each convolutional layer in the U-Net model, after incrementally training a flow-matching model on CIFAR-100. Layers with values furthest from 0 are the most important to adapt when learning a new class. Overall, the layers requiring fewer modifications are generally located next to the input and the output of the U-Net model. Results are obtained on CIFAR-100 with a split of 20-20.
  • Figure 4: Analysis of LoRA matrix characteristics: (i) left, the maximum LoRA matrix rank across all layers as the number of epochs increases; (ii) right, the average rank of the LoRA matrix across all layers (indices 40 to 54). Notably, the "rshift-10" operation consistently increases the rank to 10, corresponding to the total number of distinct functions applied. Conversely, the "cosine" operation achieves the highest maximum rank, potentially allowing for greater localized expressiveness, whereas the "power" function only reaches a maximum rank of 4.
  • Figure 5: Illustration of the weighting values $\alpha_{i}$ and the learned frequencies $\boldsymbol{\omega}_i$ for each adapted layer after training our incremental flow-matching model using the $\mathbf{F}_{{\mathbf{y}}, 10}^{\operatorname{cos, \boldsymbol{\omega}}}$ reparametrization. Blue lines and shaded areas represent the mean and standard deviation computed over all classes.
  • ...and 3 more figures
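
The reparametrization described in the Figure 1 caption can be sketched as follows. This is a rough illustration using the shapes given in the caption; the multiplicative rescaling of each $s \times s$ filter, the helper name `adapt_conv_weights`, and the choice of cosine as the entrywise function are assumptions made here for the example, and the paper's exact update rule may differ.

```python
import numpy as np

def adapt_conv_weights(W0, A_y, B_y, f=None):
    """Rescale every s x s filter of W0 by the matching entry of F_y.

    W0:  (C_out, C_in, s, s) shared convolutional weights
    A_y: (C_out, 1) and B_y: (1, C_in), the per-class rank-1 factors
    f:   optional entrywise function applied to F_y (assumed form)
    """
    F_y = A_y @ B_y                      # (C_out, C_in), rank 1 by default
    if f is not None:
        F_y = f(F_y)                     # entrywise transform can raise the rank
    # Broadcast F_y over the two spatial axes (assumed multiplicative update).
    return W0 * F_y[:, :, None, None]

rng = np.random.default_rng(0)
C_out, C_in, s = 8, 4, 3
W0 = rng.standard_normal((C_out, C_in, s, s))
A_y = rng.standard_normal((C_out, 1))
B_y = rng.standard_normal((1, C_in))

W_y = adapt_conv_weights(W0, A_y, B_y, f=np.cos)

# Per-class storage for this layer versus a full copy of its weights.
per_class_params = A_y.size + B_y.size   # C_out + C_in
full_layer_params = W0.size              # C_out * C_in * s * s
print(per_class_params, full_layer_params)
```

Under these assumptions, each class stores only $C_{out} + C_{in}$ values per adapted layer, while the shared weights ${\mathbf{W}}_{0}$ are left unchanged.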