Dataset Size Recovery from LoRA Weights

Mohammad Salama, Jonathan Kahana, Eliahu Horwitz, Yedid Hoshen

TL;DR

The paper introduces Dataset Size Recovery, a task of inferring the number of training images used to fine-tune a model from LoRA weights. It proposes DSiRe, a spectrum-based predictor that uses layer-wise singular values of LoRA matrices and a nearest-neighbor ensemble to predict dataset size, and provides LoRA-WiSE, a large benchmark for evaluation. Across multiple data ranges and backbones, DSiRe achieves low MAEs (e.g., 0.36 on small ranges and ~41.8 on large ranges) and high accuracy, illustrating the feasibility of recovering training data size from LoRA weights. The work discusses implications for privacy, billing, and data-efficiency research, and also offers defenses and avenues for future research, including data-free approaches and pre-training size recovery. Overall, the study establishes a practical vulnerability and a robust evaluation framework for dataset size recovery in PEFT-based models.

Abstract

Model inversion and membership inference attacks aim to reconstruct and verify the data on which a model was trained. However, they are not guaranteed to find all training samples, as they do not know the size of the training set. In this paper, we introduce a new task, dataset size recovery, which aims to determine the number of samples used to train a model directly from its weights. We then propose DSiRe, a method for recovering the number of images used to fine-tune a model in the common case where fine-tuning uses LoRA. We discover that both the norm and the spectrum of the LoRA matrices are closely linked to the fine-tuning dataset size; we leverage this finding to propose a simple yet effective prediction algorithm. To evaluate dataset size recovery of LoRA weights, we develop and release a new benchmark, LoRA-WiSE, consisting of over 25,000 weight snapshots from more than 2,000 diverse LoRA fine-tuned models. Our best classifier can predict the number of fine-tuning images with a mean absolute error of 0.36 images, establishing the feasibility of this attack.
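
The two quantities the abstract links to dataset size, the norm and the spectrum of a LoRA matrix, can be computed in a few lines. The following is a minimal sketch rather than the authors' implementation. It assumes the "LoRA matrix" of a layer is the low-rank update formed as the product `B @ A`; the function name `lora_spectrum` and the random example shapes are hypothetical.

```python
import numpy as np


def lora_spectrum(A: np.ndarray, B: np.ndarray) -> tuple[float, np.ndarray]:
    """Frobenius norm and leading singular values of the update B @ A.

    Assumption: A has shape (r, d_in) and B has shape (d_out, r), so the
    update has rank at most r and only the first r singular values matter.
    """
    delta_w = B @ A
    rank = A.shape[0]
    # compute_uv=False returns only the singular values, sorted descending.
    singular_values = np.linalg.svd(delta_w, compute_uv=False)[:rank]
    fro_norm = float(np.linalg.norm(delta_w, ord="fro"))
    return fro_norm, singular_values


# Hypothetical rank-4 LoRA factors for a 16 x 32 layer.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 32))
B = rng.normal(size=(16, 4))
fro_norm, sv = lora_spectrum(A, B)
print(fro_norm, sv)
```

The Frobenius norm equals the square root of the sum of squared singular values, so the norm is a one-number summary of the spectrum, while the singular values themselves give a finer-grained feature vector per layer.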

Paper Structure (42 sections, 4 equations, 6 figures, 11 tables)


Figures (6)

  • Figure 1: DSiRe: We introduce the task of dataset size recovery, which aims to recover the dataset size used to LoRA fine-tune a model based on its weights. DSiRe extracts the singular values of each LoRA matrix and treats them as features. These features are then used to train a set of layer-specific nearest-neighbor classifiers which predict the dataset size.
  • Figure 2: Norm and Spectrum of Fine-Tuning Weights vs. Dataset Size. Analysis of $210$ Stable Diffusion 1.5 models fine-tuned on datasets of sizes $1$ to $6$. (a) Frobenius norm range per dataset size. (b) Singular values per dataset size. There is a clear negative correlation between weight/spectrum magnitudes and the size of the fine-tuning dataset.
  • Figure 3: Spectrum Ranges of 2 Different Layers. Singular value distributions of two layers on opposite sides of the Stable Diffusion 1.5 UNet, fine-tuned on datasets of sizes $1$ to $6$. (a) First down block. (b) Last upper block. The last upper block shows greater separation of singular values than the first down block, highlighting that not all layers are equally informative for dataset size recovery.
  • Figure 4: DSiRe Confusion Matrix for Medium Data Range in a single experiment. The matrix illustrates DSiRe's accuracy in the range of $1$ to $50$ samples; most of the errors are near misses, highlighting DSiRe's precision in dataset size recovery.
  • Figure 5: DSiRe's Micro-Dataset Size vs. Accuracy, reported on the medium data size range ($1$ to $50$). Even a single micro-dataset is sufficient for DSiRe to reach $80\%$ accuracy, demonstrating its effectiveness with limited training data.
  • ...and 1 more figures
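
The pipeline described in the Figure 1 caption (per-layer singular values as features, one nearest-neighbor classifier per layer) can be sketched as follows. This is an illustrative sketch, not the authors' code. Several details are assumptions not stated above: each layer uses a 1-nearest-neighbor rule with Euclidean distance, the per-layer predictions are combined by majority vote, and the training data is synthetic, generated so that singular values shrink as dataset size grows (mirroring the negative correlation noted in Figure 2). The name `predict_dataset_size` is hypothetical.

```python
from collections import Counter

import numpy as np


def predict_dataset_size(train_feats: np.ndarray,
                         train_sizes: np.ndarray,
                         query_feats: np.ndarray) -> int:
    """Layer-wise 1-NN prediction combined by majority vote (assumed rule).

    train_feats: (n_models, n_layers, rank) singular values per layer
    train_sizes: (n_models,) known dataset size of each training model
    query_feats: (n_layers, rank) singular values of the model under test
    """
    votes = []
    for layer in range(train_feats.shape[1]):
        # Distance from the query layer to the same layer of every model.
        dists = np.linalg.norm(train_feats[:, layer, :] - query_feats[layer], axis=1)
        votes.append(int(train_sizes[np.argmin(dists)]))
    # The most frequent per-layer prediction is the final answer.
    return Counter(votes).most_common(1)[0][0]


# Synthetic data: 5 models per dataset size 1..6, 8 layers, rank 4.
rng = np.random.default_rng(0)
sizes = np.repeat(np.arange(1, 7), 5)
n_layers, rank = 8, 4
noise = 0.005 * rng.normal(size=(sizes.size, n_layers, rank))
train = 1.0 / sizes[:, None, None] + noise  # magnitudes fall as size grows

# A held-out synthetic model "fine-tuned" on 3 images.
query = 1.0 / 3 + 0.005 * rng.normal(size=(n_layers, rank))
predicted = predict_dataset_size(train, sizes, query)
print(predicted)
```

Keeping a separate classifier per layer lets layers with well-separated spectra outvote less informative ones, which fits the observation in Figure 3 that some layers separate dataset sizes better than others.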