AutoLoRA: AutoGuidance Meets Low-Rank Adaptation for Diffusion Models

Artur Kasymov, Marcin Sendera, Michał Stypułkowski, Maciej Zięba, Przemysław Spurek

TL;DR

This work introduces AutoLoRA, a novel guidance technique for diffusion models fine-tuned with the LoRA approach that searches for a trade-off between consistency in the domain represented by LoRA weights and sample diversity from the base conditional diffusion model.

Abstract

Low-rank adaptation (LoRA) is a fine-tuning technique that can be applied to conditional generative diffusion models. LoRA utilizes a small number of context examples to adapt the model to a specific domain, character, style, or concept. However, due to the limited data utilized during training, the fine-tuned model performance is often characterized by strong context bias and a low degree of variability in the generated images. To solve this issue, we introduce AutoLoRA, a novel guidance technique for diffusion models fine-tuned with the LoRA approach. Inspired by other guidance techniques, AutoLoRA searches for a trade-off between consistency in the domain represented by LoRA weights and sample diversity from the base conditional diffusion model. Moreover, we show that incorporating classifier-free guidance for both LoRA fine-tuned and base models leads to generating samples with higher diversity and better quality. The experimental results for several fine-tuned LoRA domains show superiority over existing guidance techniques on selected metrics.
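
The abstract describes the trade-off only in words, and this excerpt does not give the guidance formula. The sketch below is one plausible reading and not the authors' implementation. It assumes an AutoGuidance-style extrapolation from the base model's noise prediction toward the LoRA model's prediction, with classifier-free guidance (CFG) applied to each model first. The function names, argument names, and the default scales are all hypothetical; the arrays stand in for the outputs of real denoisers.

```python
import numpy as np


def cfg(eps_cond, eps_uncond, cfg_scale):
    """Standard classifier-free guidance: move from the unconditional
    prediction toward the conditional one by a factor of cfg_scale."""
    return eps_uncond + cfg_scale * (eps_cond - eps_uncond)


def autoguidance_style(eps_lora, eps_base, auto_scale):
    """ASSUMED form (AutoGuidance-style), not taken from the paper text:
    auto_scale = 0 returns the base prediction, auto_scale = 1 returns the
    LoRA prediction, and auto_scale > 1 extrapolates past the LoRA one."""
    return eps_base + auto_scale * (eps_lora - eps_base)


def guided_prediction(lora_cond, lora_uncond, base_cond, base_uncond,
                      cfg_scale=5.0, auto_scale=1.5):
    """Apply CFG to the LoRA and base models separately, then combine the
    two guided predictions with the assumed AutoGuidance-style rule."""
    lora_cfg = cfg(lora_cond, lora_uncond, cfg_scale)
    base_cfg = cfg(base_cond, base_uncond, cfg_scale)
    return autoguidance_style(lora_cfg, base_cfg, auto_scale)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (4, 8, 8)  # toy latent shape, chosen only for illustration
    # Random arrays stand in for the four denoiser outputs at one timestep.
    preds = [rng.standard_normal(shape) for _ in range(4)]
    eps = guided_prediction(*preds, cfg_scale=5.0, auto_scale=1.5)
    print(eps.shape)
```

In a sampler, `guided_prediction` would replace the single noise estimate at each denoising step. Under this assumed form, each step costs four forward passes (conditional and unconditional for both the LoRA and the base model) instead of the two needed for plain CFG.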

Paper Structure

This paper contains 24 sections, 8 equations, 8 figures, 2 tables, and 2 algorithms.

Figures (8)

  • Figure 1: Comparison of the influence of different LoRA scales. Columns correspond to the scales: $0.4$, $0.7$, $1.0$ and $1.3$. All samples are generated from the same initial noise.
  • Figure 2: Comparison of the impact of different Classifier-Free Guidance scales for the vanilla LoRA model with CFG and AutoLoRA.
  • Figure 3: Comparison of the influence of different Classifier-Free Guidance scales for the vanilla LoRA model with CFG (top) and AutoLoRA (bottom). The corresponding CFG scale is shown below each sample. All samples are generated from the same initial noise.
  • Figure 4: Comparison of different scale factors (i.e., $1.5$, $1.75$, $2.0$) in AutoLoRA. Samples in each row are generated from the same initial noise using Stable Diffusion XL and Disney Princess LoRA. CFG samples are presented in the left column as a reference.
  • Figure 5: Samples from the same noise using CFG and AutoLoRA for the Pixel Art LoRA module and two models: Stable Diffusion XL (left columns) and Stable Diffusion 3 (right columns).
  • ...and 3 more figures