Continual Personalization for Diffusion Models

Yu-Chien Liao, Jr-Jen Chen, Chi-Pin Huang, Ci-Siang Lin, Meng-Lin Wu, Yu-Chiang Frank Wang

TL;DR

This work tackles continual personalization of diffusion models by introducing Concept Neuron Selection (CNS), a neuron-level approach that automatically identifies concept-specific neurons in cross-attention and updates only those units in an incremental training regime. CNS differentiates base neurons (concept-responsive) from general neurons (responsible for generic image generation) using calibrated prompts, and defines concept neurons as the intersection of base and non-general masks, enabling zero-shot preservation via a continual regularization loss $L_{reg}$ that ties new updates to both prior personalized concepts and the original pretrained weights. The framework is fusion-free, requiring no extra LoRA storage, and demonstrates state-of-the-art performance on a 20-concept real-world dataset with efficient updates (about $0.13\%$ of parameters for a single concept) while maintaining high image- and text-alignment. Empirically, CNS outperforms baselines in single- and multi-concept settings, shows robust resistance to catastrophic forgetting, and supports region control tactics, with strong qualitative and quantitative results and broad potential for extension to other modalities and knowledge-editing tasks.
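The mask logic described above (concept neurons as the intersection of the base mask and the non-general mask, with only those units updated) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-neuron importance scores, the top-ratio thresholding, and the names `top_k_mask`, `concept_mask`, and `masked_update` are all assumptions introduced here, and the summary does not specify how CNS actually scores neurons.

```python
import numpy as np

def top_k_mask(scores, ratio):
    """Boolean mask marking the highest-scoring `ratio` fraction of neurons.

    Assumption: each neuron has a scalar importance score; the scoring rule
    itself is not given in the source and is left to the caller.
    Ties at the threshold may select slightly more than the requested fraction.
    """
    k = max(1, int(round(ratio * scores.size)))
    threshold = np.partition(scores.ravel(), -k)[-k]
    return scores >= threshold

def concept_mask(concept_scores, general_scores, ratio):
    """Concept neurons = base neurons that are NOT general neurons."""
    base = top_k_mask(concept_scores, ratio)      # respond to the concept prompt
    general = top_k_mask(general_scores, ratio)   # respond to generic prompts
    return base & ~general

def masked_update(weights, grad, mask, lr):
    """One gradient step that changes only the selected concept neurons."""
    return weights - lr * grad * mask

# Toy example with four neurons (hypothetical scores).
concept_scores = np.array([0.9, 0.8, 0.1, 0.2])
general_scores = np.array([0.95, 0.1, 0.2, 0.05])
mask = concept_mask(concept_scores, general_scores, ratio=0.5)
new_weights = masked_update(np.ones(4), np.ones(4), mask, lr=0.1)
```

In this toy setting, the first neuron scores highly for both the concept and the generic prompts, so it is treated as general and left frozen; only the second neuron is updated. Freezing the remaining weights is what the summary credits for preserving zero-shot generation.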

Abstract

Updating diffusion models in an incremental setting is practical for real-world applications yet computationally challenging. We present Concept Neuron Selection (CNS), a simple yet effective learning strategy for personalization in a continual learning scheme. CNS identifies the neurons in diffusion models that are closely related to the target concepts. To mitigate catastrophic forgetting while preserving zero-shot text-to-image generation ability, CNS finetunes concept neurons incrementally and jointly preserves the knowledge learned from previous concepts. Evaluation on real-world datasets demonstrates that CNS achieves state-of-the-art performance with minimal parameter adjustments, outperforming previous methods in both single- and multi-concept personalization. CNS also operates fusion-free, reducing memory storage and processing time for continual personalization.

Paper Structure

This paper contains 46 sections, 8 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Continual personalization. We present Concept Neuron Selection, CNS, a simple yet effective approach to incrementally customize visual concepts. By finetuning concept-related neurons, CNS preserves the zero-shot capabilities of pretrained diffusion models and alleviates catastrophic forgetting problems.
  • Figure 2: Overview of CNS. The proposed framework for neuron selection consists of (a) base neuron selection, (b) general neuron selection, and (c) concept neuron separation. With sparse, concept-specific neurons automatically selected for each concept, the proposed incremental finetuning scheme in (d) updates the text-to-image diffusion model for continual personalization.
  • Figure 3: Overlapped percentage of base neurons across images. By increasing the number of text prompts, we observe a high percentage (around 53%) of base neurons shared across the resulting images. This suggests that a large portion of base neurons share the goal of image generation, not concept personalization.
  • Figure 4: Qualitative visualization. Note that only Continual Diffusion dong2024continually and CNS are capable of performing continual personalization, while Mix-of-Show gu2024mix and Orthogonal Adaptation po2024orthogonal require keeping a separate LoRA for each personalized concept. Our personalized outputs match concepts learned at different times, alleviating appearance leakage and catastrophic forgetting.
  • Figure 5: Performance degradation on the first concept over time. Compared to the ablated version of our method, the full version of CNS remains robust during continual learning, with negligible degradation on the first concept. This confirms its ability to alleviate catastrophic forgetting.
  • ...and 4 more figures