AC-LoRA: Auto Component LoRA for Personalized Artistic Style Image Generation

Zhipu Cui, Andong Tian, Zhi Ying, Jialiang Lu

TL;DR

The paper addresses the difficulty of choosing the LoRA rank for personalized image generation. It introduces AutoComponent-LoRA (AC-LoRA), which automatically separates the signal and noise components of LoRA updates using SVD and a RESTART-based stochastic adjustment. A dynamic threshold $p$, tied to the training loss and the epoch, determines which components are retained; low-significance ranks are replaced with Gaussian noise to maintain stability. According to the paper, this reduces both hyperparameter search and training time. The approach is evaluated on eight artistic styles using FID, CLIP, DINO, and ImageReward, and shows consistent improvements over existing LoRA variants while enabling high-quality personalization from small datasets. The authors suggest that the method may extend to other probabilistic generative models, and they point to pre-trained encoders and robustness to noisy inputs as directions for future work.

Abstract

Personalized image generation allows users to preserve styles or subjects of a provided small set of images for further image generation. With the advancement in large text-to-image models, many techniques have been developed to efficiently fine-tune those models for personalization, such as Low Rank Adaptation (LoRA). However, LoRA-based methods often face the challenge of adjusting the rank parameter to achieve satisfactory results. To address this challenge, AutoComponent-LoRA (AC-LoRA) is proposed, which is able to automatically separate the signal component and noise component of the LoRA matrices for fast and efficient personalized artistic style image generation. This method is based on Singular Value Decomposition (SVD) and dynamic heuristics to update the hyperparameters during training. Superior performance over existing methods in overcoming model underfitting or overfitting problems is demonstrated. The results were validated using FID, CLIP, DINO, and ImageReward, achieving an average of 9% improvement.
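
The signal/noise separation described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `keep_signal_components`, the cumulative-energy rule for choosing how many components to keep, and the way Gaussian noise is added back are all assumptions made for illustration. The paper's schedule for the threshold $p$ (tied to loss and epoch) is not reproduced here, so `p` is passed in as a plain argument.

```python
import numpy as np


def keep_signal_components(delta_w, p, rng, noise_std=None):
    """Hypothetical sketch: split a LoRA update into signal and noise via SVD.

    delta_w   : 2-D array, the low-rank update (e.g. B @ A), assumed nonzero.
    p         : fraction in (0, 1] of squared singular-value mass to keep.
    rng       : numpy.random.Generator used for the Gaussian noise.
    noise_std : std of the replacement noise; defaults to the std of the
                discarded residual (an assumption, not taken from the paper).
    """
    u, s, vt = np.linalg.svd(delta_w, full_matrices=False)
    # Cumulative share of squared singular values, increasing toward 1.
    energy = np.cumsum(s**2) / np.sum(s**2)
    # Smallest number of leading components whose share reaches p.
    k = min(int(np.searchsorted(energy, p)) + 1, len(s))
    # Rank-k reconstruction: the retained "signal" part.
    signal = (u[:, :k] * s[:k]) @ vt[:k, :]
    residual = delta_w - signal
    if noise_std is None:
        noise_std = float(residual.std())
    # Replace the discarded part with Gaussian noise of matching scale.
    noise = rng.normal(0.0, noise_std, size=delta_w.shape)
    return signal + noise, signal, k


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    b = rng.normal(size=(64, 8))   # stand-in for the LoRA "B" matrix
    a = rng.normal(size=(8, 32))   # stand-in for the LoRA "A" matrix
    adapted, signal, k = keep_signal_components(b @ a, p=0.9, rng=rng)
    print("components kept:", k, "shape:", adapted.shape)
```

In this sketch a larger `p` keeps more singular directions, so a schedule that changes `p` during training would change the effective rank without a manual rank search, which is the behavior the summary attributes to AC-LoRA.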

Paper Structure

This paper contains 15 sections, 10 equations, 4 figures, 1 table, and 1 algorithm.

Figures (4)

  • Figure 1: Personalized image generation results of four artistic styles based on four prompts using AC-LoRA.
  • Figure 2: Comparison of generated images at different ranks.
  • Figure 3: The overall pipeline of Auto Component.
  • Figure 4: The comparison among AC-LoRA, other LoRAs, and the base model