FeRA: Frequency-Energy Constrained Routing for Effective Diffusion Adaptation Fine-Tuning
Bo Yin, Xiaobin Hu, Xingyu Zhou, Peng-Tao Jiang, Yue Liao, Junwei Zhu, Jiangning Zhang, Ying Tai, Chengjie Wang, Shuicheng Yan
TL;DR
FeRA tackles the challenge of adapting large diffusion backbones under parameter-efficient fine-tuning by exploiting the frequency-energy evolution observed during denoising. It introduces a compact Frequency-Energy Indicator, a Soft Frequency Router that blends multiple frequency-specific adapters, and a Frequency-Energy Consistency Loss to stabilize training, all operating on latent spectral states. The approach demonstrates consistent gains in generation fidelity, style control, and identity preservation across diverse backbones and resolutions, with only modest runtime overhead. By aligning adaptation with the intrinsic spectral progression, FeRA provides a simple, stable, and transferable framework for diffusion adaptation.
Abstract
Diffusion models have achieved remarkable success in generative modeling, yet how to effectively adapt large pretrained models to new tasks remains challenging. We revisit the reconstruction behavior of diffusion models during denoising to unveil the underlying frequency-energy mechanism governing this process. Building upon this observation, we propose FeRA, a frequency-driven fine-tuning framework that aligns parameter updates with the intrinsic frequency-energy progression of diffusion. FeRA establishes a comprehensive frequency-energy framework for effective diffusion adaptation fine-tuning, comprising three synergistic components: (i) a compact frequency-energy indicator that characterizes the latent band-wise energy distribution, (ii) a soft frequency router that adaptively fuses multiple frequency-specific adapter experts, and (iii) a frequency-energy consistency regularization that stabilizes diffusion optimization and ensures coherent adaptation across bands. Routing operates in both training and inference, with inference-time routing dynamically determined by the latent frequency energy. FeRA integrates seamlessly with adapter-based tuning schemes and generalizes well across diffusion backbones and resolutions. By aligning adaptation with the frequency-energy mechanism, FeRA provides a simple, stable, and compatible paradigm for effective and robust diffusion model adaptation.
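To make the first two components more concrete, the following is a minimal NumPy sketch of how a band-wise energy indicator could drive a soft router over adapter experts. It is an illustration under stated assumptions, not the authors' implementation: the function names (`band_energy_indicator`, `soft_route`, `fuse_adapters`), the equal-width radial band partition, the linear-plus-softmax router, the `temperature` parameter, and the LoRA-style low-rank experts are all hypothetical choices made here. The consistency regularization is omitted because its form is not given in the abstract.

```python
import numpy as np


def band_energy_indicator(latent, num_bands=3):
    """Fraction of spectral energy in each radial frequency band.

    latent: array of shape (C, H, W).
    Assumption: equal-width radial bands; the paper's partition may differ.
    """
    _, h, w = latent.shape
    power = np.abs(np.fft.fft2(latent, axes=(-2, -1))) ** 2
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)
    edges = np.linspace(0.0, radius.max(), num_bands + 1)
    # Interior edges map every frequency to a band index in 0..num_bands-1.
    band_index = np.digitize(radius, edges[1:-1])
    energy = np.array(
        [power[:, band_index == k].sum() for k in range(num_bands)]
    )
    return energy / (energy.sum() + 1e-12)


def soft_route(indicator, weight, bias, temperature=1.0):
    """Softmax gate over experts (assumed linear router, hypothetical)."""
    logits = (indicator @ weight + bias) / temperature
    logits = logits - logits.max()  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum()


def fuse_adapters(x, down, up, gates):
    """Gate-weighted sum of low-rank expert updates.

    x: (N, D), down: (K, D, r), up: (K, r, D), gates: (K,).
    Assumption: each expert is a LoRA-style pair of low-rank matrices.
    """
    per_expert = np.einsum("nd,kdr,kre->kne", x, down, up)
    return np.einsum("k,kne->ne", gates, per_expert)


# Usage with random stand-ins for a latent and three adapter experts.
rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 16, 16))
indicator = band_energy_indicator(latent, num_bands=3)
gates = soft_route(indicator, rng.standard_normal((3, 3)), np.zeros(3))
x = rng.standard_normal((5, 8))
down = rng.standard_normal((3, 8, 2))
up = rng.standard_normal((3, 2, 8))
update = fuse_adapters(x, down, up, gates)
```

Because the gate is recomputed from the current latent, the same routine could in principle run unchanged at every denoising step in both training and inference, which matches the abstract's description of inference-time routing being determined by the latent frequency energy.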
