Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models
Reza Shirkavand, Peiran Yu, Shangqian Gao, Gowthami Somepalli, Tom Goldstein, Heng Huang
TL;DR
The paper tackles the challenge of deploying efficient diffusion models by integrating pruning, distillation, and concept unlearning into a single bilevel optimization framework. The lower level performs standard diffusion fine-tuning with distillation to restore generation quality, while the upper level directs the model away from unwanted concepts, enabling selective suppression without sacrificing fidelity. The proposed approach, solvable with first-order bilevel methods, outperforms two-stage baselines on artist-style erasure and NSFW content removal, while maintaining strong generation quality on unrelated concepts. This enables safer, more practical deployment of diffusion models in resource-constrained settings. The framework is plug-in compatible with various pruning and unlearning methods, making it broadly applicable for controlled diffusion in real-world applications.
Abstract
Recent advances in diffusion generative models have yielded remarkable progress. While the quality of generated content continues to improve, these models have grown considerably in size and complexity. This increasing computational burden poses significant challenges, particularly in resource-constrained deployment scenarios such as mobile devices. The combination of model pruning and knowledge distillation has emerged as a promising solution to reduce computational demands while preserving generation quality. However, this technique inadvertently propagates undesirable behaviors, including the generation of copyrighted content and unsafe concepts, even when such instances are absent from the fine-tuning dataset. In this paper, we propose a novel bilevel optimization framework for pruned diffusion models that consolidates the fine-tuning and unlearning processes into a unified phase. Our approach maintains the principal advantages of distillation, namely efficient convergence and style transfer capabilities, while selectively suppressing the generation of unwanted content. This plug-in framework is compatible with various pruning and concept unlearning methods, facilitating efficient, safe deployment of diffusion models in controlled environments.
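
The bilevel structure described in the TL;DR can be written schematically as follows. This is a sketch under assumed notation, not the paper's exact formulation: the symbols $\theta$ (parameters of the pruned model), $\mathcal{L}_{\mathrm{ft}}$ (diffusion fine-tuning loss with distillation), $\mathcal{L}_{\mathrm{unlearn}}$ (concept-suppression loss), and $\lambda$ (penalty weight) are all introduced here for illustration.

```latex
% Upper level: steer the model away from unwanted concepts.
% Lower level: restrict to parameters that minimize the
% fine-tuning + distillation loss (assumed notation).
\min_{\theta} \; \mathcal{L}_{\mathrm{unlearn}}(\theta)
\quad \text{s.t.} \quad
\theta \in \arg\min_{\theta'} \; \mathcal{L}_{\mathrm{ft}}(\theta')

% One common first-order route (an assumption, not necessarily the
% paper's choice): replace the constraint with a value-function penalty.
\min_{\theta} \; \mathcal{L}_{\mathrm{unlearn}}(\theta)
  + \lambda \Bigl( \mathcal{L}_{\mathrm{ft}}(\theta)
  - \min_{\theta'} \mathcal{L}_{\mathrm{ft}}(\theta') \Bigr),
\qquad \lambda > 0
```

In the penalized form, the term in parentheses is non-negative and vanishes exactly when $\theta$ solves the lower-level problem, so only gradients of the two losses are needed, with no second-order terms.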
