Mining Your Own Secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models

Saurav Jha, Shiqi Yang, Masato Ishii, Mengjie Zhao, Christian Simon, Muhammad Jehanzeb Mirza, Dong Gong, Lina Yao, Shusuke Takahashi, Yuki Mitsufuji

TL;DR

This work tackles continual personalization of text-to-image diffusion models in the replay-free setting, using diffusion classifier (DC) scores as regularizers. It introduces two consolidation strategies: parameter-space consolidation, which applies Elastic Weight Consolidation (EWC) in the LoRA space, and function-space consolidation, called Diffusion Scores Consolidation (DSC), which performs double distillation guided by DC scores and diffusion noise predictions. Across diverse datasets and long task sequences, the proposed DC-based methods outperform baselines such as C-LoRA and TI/CD, while incurring zero additional storage and minimal inference-time overhead. The approach reduces forgetting of earlier concepts and improves plasticity on new ones in sequential concept learning, is compatible with VeRA and multi-concept generation, and offers a practical, scalable path for user-specific diffusion model personalization.
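
To make the parameter-space idea concrete, the following is a minimal sketch of the generic EWC quadratic penalty applied to adapter weights. It is not the paper's implementation: the `Params` layout, the adapter name `lora_A`, and the helper names are hypothetical, the Fisher information is approximated by its diagonal, and the sketch does not reproduce how the paper involves DC scores in estimating it.

```python
from typing import Dict, List

# Hypothetical layout: adapter name -> flat list of weights.
Params = Dict[str, List[float]]


def diagonal_fisher(per_sample_grads: List[Params]) -> Params:
    """Estimate a diagonal Fisher as the mean of squared per-sample gradients."""
    n = len(per_sample_grads)
    fisher: Params = {}
    for name in per_sample_grads[0]:
        size = len(per_sample_grads[0][name])
        fisher[name] = [
            sum(g[name][i] ** 2 for g in per_sample_grads) / n
            for i in range(size)
        ]
    return fisher


def ewc_penalty(
    params: Params, anchor: Params, fisher: Params, lam: float = 1.0
) -> float:
    """Generic EWC term: 0.5 * lam * sum_i F_i * (theta_i - theta_old_i)^2."""
    total = 0.0
    for name, values in params.items():
        for p, a, f in zip(values, anchor[name], fisher[name]):
            total += f * (p - a) ** 2
    return 0.5 * lam * total
```

In this sketch, `anchor` holds the adapter weights saved after the previous concept and `fisher` weights each coordinate by how much the previous task depended on it, so the penalty added to the new task's loss discourages movement along important directions. Only the weights and their Fisher estimates are kept, not past images, which matches the replay-free constraint.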

Abstract

Personalized text-to-image diffusion models have grown popular for their ability to efficiently acquire a new concept from user-defined text descriptions and a few images. However, in the real world, a user may wish to personalize a model on multiple concepts but one at a time, with no access to the data from previous concepts due to storage/privacy concerns. When faced with this continual learning (CL) setup, most personalization methods fail to find a balance between acquiring new concepts and retaining previous ones -- a challenge that continual personalization (CP) aims to solve. Inspired by the successful CL methods that rely on class-specific information for regularization, we resort to the inherent class-conditioned density estimates, also known as diffusion classifier (DC) scores, for continual personalization of text-to-image diffusion models. Namely, we propose using DC scores for regularizing the parameter-space and function-space of text-to-image diffusion models, to achieve continual personalization. Using several diverse evaluation setups, datasets, and metrics, we show that our proposed regularization-based CP methods outperform the state-of-the-art C-LoRA, and other baselines. Finally, by operating in the replay-free CL setup and on low-rank adapters, our method incurs zero storage and parameter overhead, respectively, over the state-of-the-art. Our project page: https://srvcodes.github.io/continual_personalization/
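
As a rough illustration of what a class-conditioned density estimate looks like in practice, the sketch below follows the common diffusion classifier formulation: a softmax over negative mean noise-prediction errors, one per candidate concept. It assumes the squared errors between the sampled noise and the model's prediction have already been computed for each concept prompt on shared timestep and noise draws. The function name, the input layout, and the concept names are hypothetical, and the paper's exact scoring may differ.

```python
import math
from typing import Dict, List


def dc_scores(noise_errors: Dict[str, List[float]]) -> Dict[str, float]:
    """Softmax over negative mean noise-prediction errors, one score per concept.

    noise_errors maps a concept prompt to squared errors collected over
    several (timestep, noise) draws for the same input image.
    """
    logits = {c: -sum(errs) / len(errs) for c, errs in noise_errors.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {c: math.exp(v - top) for c, v in logits.items()}
    norm = sum(exps.values())
    return {c: e / norm for c, e in exps.items()}
```

A concept whose prompt yields lower denoising error receives a higher score, so calling `dc_scores({"dog": [0.1, 0.3], "cat": [0.9, 1.1]})` ranks `dog` above `cat`. Because these scores come from the diffusion model itself, they can serve as a regularization signal without any stored data from earlier concepts.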

Paper Structure

This paper contains 32 sections, 9 equations, 29 figures, 9 tables, 4 algorithms.

Figures (29)

  • Figure 1: Task-wise evolution of C-LoRA's: (a) weights, (b) losses.
  • Figure 2: Framework for Continual personalization with Diffusion Classifier (DC) scores.
  • Figure 3: Our consolidation frameworks for: (a) parameter-space, (b) function-space.
  • Figure 4: Qualitative results on CL setups generated after training on all tasks.
  • Figure 5: Top-k eigenvalue analysis for FIM.
  • ...and 24 more figures