LoRAX: LoRA eXpandable Networks for Continual Synthetic Image Attribution

Danielle Sullivan-Pao, Nicole Tian, Pooya Khorrami

TL;DR

LoRAX addresses the problem of continual deepfake attribution under evolving generative models by introducing per-task LoRA adapters on a frozen ConViT backbone, forming task-specific feature extractors whose outputs are concatenated into a single super feature for attribution. The method leverages a two-term loss with a diversity component to minimize redundancy among adapters, and it employs exemplar memory to mitigate forgetting across tasks. Empirical results on the CDDB benchmark show LoRAX is competitive with or superior to state-of-the-art class incremental learning methods across memory budgets, while dramatically reducing trainable parameters (e.g., ~2.5M vs ~86M for ConViT Base). The approach also demonstrates the importance of backbone choice, with ConViT-based LoRAX delivering strong performance and substantial memory savings, suggesting practical applicability for scalable, continual deepfake attribution in real-world settings.

Abstract

As generative AI image technologies become more widespread and advanced, there is a growing need for strong attribution models. These models are crucial for verifying the authenticity of images and identifying the architecture of their originating generative models, both of which are key to maintaining media integrity. However, attribution models struggle to generalize to unseen models, and traditional fine-tuning methods for updating these models have proven impractical in real-world settings. To address these challenges, we propose LoRA eXpandable Networks (LoRAX), a parameter-efficient class incremental algorithm that adapts to novel generative image models without the need for full retraining. Our approach trains an extremely parameter-efficient feature extractor per continual learning task via Low Rank Adaptation. Each task-specific feature extractor learns distinct features while requiring only a small fraction of the parameters present in the underlying feature extractor's backbone model. Extensive experiments show that LoRAX outperforms or remains competitive with state-of-the-art class incremental learning algorithms on the Continual Deepfake Detection benchmark across all training scenarios and memory settings, while requiring less than 3% of the number of trainable parameters per feature extractor compared to the full-rank implementation. LoRAX code is available at: https://github.com/mit-ll/lorax_cil.
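The parameter savings quoted above can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from the LoRAX code base: it assumes the standard LoRA factorization $\Delta W = BA$ with $B \in \mathbb{R}^{d_{out} \times r}$ and $A \in \mathbb{R}^{r \times d_{in}}$, and the 768 x 768 projection size is a hypothetical choice. The function names are made up for this example. The ~2.5M and ~86M totals are the approximate figures from the TL;DR.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters in one low-rank update dW = B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_out + d_in)


def full_param_count(d_out: int, d_in: int) -> int:
    """Parameters in the full-rank weight matrix of shape (d_out, d_in)."""
    return d_out * d_in


# Hypothetical dimensions (an assumption, not from the paper):
# one square 768 x 768 projection adapted at rank 64.
d, r = 768, 64
per_matrix_ratio = lora_param_count(d, d, r) / full_param_count(d, d)

# Approximate totals quoted in the TL;DR: ~2.5M trainable parameters
# per LoRAX feature extractor vs ~86M for a full ConViT Base.
lora_total = 2.5e6
backbone_total = 86e6
reported_ratio = lora_total / backbone_total

print(f"per-matrix ratio at rank {r}: {per_matrix_ratio:.3f}")
print(f"reported whole-model ratio:  {reported_ratio:.3f}")
```

For a square matrix the per-matrix ratio simplifies to $2r/d$. The whole-model ratio also depends on which weight matrices receive adapters, so the two numbers are not expected to match.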

Paper Structure

This paper contains 34 sections, 5 equations, 9 figures, 6 tables, 2 algorithms.

Figures (9)

  • Figure 1: LoRA eXpandable Network: For each task $i$, a LoRA adapter $\Delta W_i$ is applied to the pretrained backbone network $W$ to form feature extractor $\phi_i$. Input images are passed through each feature extractor, and the output embeddings are all concatenated to form superfeature $E$. $E$ is passed to classifier head $CLF$ to predict image attribution. During training, the most recent task's feature $e_i$ is extracted and passed to diversity head $DIV$ to reduce redundancy between new and old features. $F^N$ refers to the model after $N$ tasks.
  • Figure 2: Comparison of Backbone Performance across CIL Algorithms. Black hatching: top within CIL algorithm for the scenario & memory setting; Red hatching: top overall for the scenario & memory setting.
  • Figure 3: CIL Algorithm Performance vs. Number of Trainable Parameters. Black outline: top performer within CIL algorithm; Red outline: top overall.
  • Figure 4: Effect of Image Resizing on Final Task Accuracy (ConViT Base, $r$=64, $M$=500). The final accuracy of each task with resizing (blue) and without resizing (coral) after training on the final CIL task. Tasks are plotted in the order they appear in the data stream (e.g., GauGAN is task 1 of the Easy scenario).
  • Figure 5: LoRAX AAF Across Adapter Combinations (Memory = 500). The weight matrices to which adapters were applied are shown on the x-axis. All refers to $W_v, W_{qk}, W_{qkv}$, and the ConViT positional embeddings.
  • ...and 4 more figures
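
The data flow described in the Figure 1 caption can be sketched in a few lines of plain Python. This is a toy illustration under strong assumptions, not the authors' implementation: the backbone is reduced to a single frozen matrix $W$, each feature extractor is $\phi_i(x) = (W + \Delta W_i)x$ with $\Delta W_i = B_i A_i$, and random values stand in for trained adapter weights (standard LoRA initializes $B$ to zero). All class and function names here are hypothetical. The classifier head $CLF$ and diversity head $DIV$ are omitted.

```python
import random
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two matrices with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def rand_matrix(rows: int, cols: int, rng: random.Random, scale: float = 0.1) -> Matrix:
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LoRAAdapter:
    """Low-rank update dW = B @ A for one task (hypothetical helper class)."""

    def __init__(self, d_out: int, d_in: int, rank: int, rng: random.Random):
        # Random values stand in for trained weights (an assumption).
        self.B = rand_matrix(d_out, rank, rng)
        self.A = rand_matrix(rank, d_in, rng)

    def delta(self) -> Matrix:
        return matmul(self.B, self.A)


def extract(W: Matrix, adapter: LoRAAdapter, x: List[float]) -> List[float]:
    """Task feature e_i = (W + dW_i) x; W itself is never modified."""
    w_eff = add(W, adapter.delta())
    return [row[0] for row in matmul(w_eff, [[v] for v in x])]


def superfeature(W: Matrix, adapters: List[LoRAAdapter], x: List[float]) -> List[float]:
    """Concatenate the features of every task-specific extractor into E."""
    E: List[float] = []
    for adapter in adapters:
        E.extend(extract(W, adapter, x))
    return E


rng = random.Random(0)
d_in, d_out, rank = 8, 4, 2
W = rand_matrix(d_out, d_in, rng)          # frozen backbone stand-in
adapters = [LoRAAdapter(d_out, d_in, rank, rng) for _ in range(3)]
x = [rng.uniform(-1.0, 1.0) for _ in range(d_in)]

E = superfeature(W, adapters, x)           # three tasks seen so far

# A new task adds one adapter; earlier extractors are left untouched.
adapters.append(LoRAAdapter(d_out, d_in, rank, rng))
E_next = superfeature(W, adapters, x)
print(len(E), len(E_next))
```

In this sketch the superfeature grows by one extractor's output width per task, and the entries contributed by earlier adapters are unchanged when a new adapter is appended.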