FeRA: Frequency-Energy Constrained Routing for Effective Diffusion Adaptation Fine-Tuning

Bo Yin, Xiaobin Hu, Xingyu Zhou, Peng-Tao Jiang, Yue Liao, Junwei Zhu, Jiangning Zhang, Ying Tai, Chengjie Wang, Shuicheng Yan

TL;DR

FeRA tackles the challenge of adapting large diffusion backbones under parameter-efficient fine-tuning by exploiting the frequency-energy evolution observed during denoising. It introduces a compact Frequency-Energy Indicator, a Soft Frequency Router that blends multiple frequency-specific adapters, and a Frequency-Energy Consistency Loss to stabilize training, all operating on latent spectral states. The approach demonstrates consistent gains in generation fidelity, style control, and identity preservation across diverse backbones and resolutions, with only modest runtime overhead. By aligning adaptation with the intrinsic spectral progression, FeRA provides a simple, stable, and transferable framework for diffusion adaptation.

Abstract

Diffusion models have achieved remarkable success in generative modeling, yet how to effectively adapt large pretrained models to new tasks remains challenging. We revisit the reconstruction behavior of diffusion models during denoising to unveil the underlying frequency-energy mechanism governing this process. Building upon this observation, we propose FeRA, a frequency-driven fine-tuning framework that aligns parameter updates with the intrinsic frequency-energy progression of diffusion. FeRA establishes a comprehensive frequency-energy framework for effective diffusion adaptation fine-tuning, comprising three synergistic components: (i) a compact frequency-energy indicator that characterizes the latent band-wise energy distribution, (ii) a soft frequency router that adaptively fuses multiple frequency-specific adapter experts, and (iii) a frequency-energy consistency regularization that stabilizes diffusion optimization and ensures coherent adaptation across bands. Routing operates in both training and inference, with inference-time routing dynamically determined by the latent frequency energy. FeRA integrates seamlessly with adapter-based tuning schemes and generalizes well across diffusion backbones and resolutions. By aligning adaptation with the frequency-energy mechanism, FeRA provides a simple, stable, and compatible paradigm for effective and robust diffusion model adaptation.
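The soft routing idea in component (ii) can be illustrated with a small sketch. This is not the authors' implementation: the linear router, its weights `router_weights`, the `temperature` parameter, and all function names below are assumptions introduced only for illustration. The sketch normalizes a vector of per-band energies into an indicator, maps it to softmax gates, and blends the outputs of several experts with those gates.

```python
import math

def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def normalize_energy(band_energies, eps=1e-8):
    """Turn raw per-band energies into a distribution that sums to about 1."""
    total = sum(band_energies) + eps
    return [e / total for e in band_energies]

def route(band_energies, router_weights, temperature=1.0):
    """Map a band-energy indicator to one mixing weight per expert.

    router_weights[k][j] is a hypothetical weight from band j to expert k;
    in a real system it would be learned, here it is fixed by hand.
    """
    indicator = normalize_energy(band_energies)
    logits = [sum(w * x for w, x in zip(row, indicator)) for row in router_weights]
    return softmax(logits, temperature)

def blend(expert_outputs, gates):
    """Gate-weighted sum of expert outputs (lists of floats of equal length)."""
    dim = len(expert_outputs[0])
    return [sum(g * out[i] for g, out in zip(gates, expert_outputs)) for i in range(dim)]

# Toy example: the first (assumed low-frequency) band holds most of the energy.
energies = [9.0, 0.9, 0.1]
router_weights = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
gates = route(energies, router_weights)

# Stand-ins for the outputs of three frequency-specific adapter experts.
expert_outputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
mixed = blend(expert_outputs, gates)
```

Because the gates come from a softmax, every expert contributes to the output, with the expert matched to the dominant band weighted most heavily. That is the sense in which the routing is soft rather than a hard top-1 selection.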

Paper Structure

This paper contains 22 sections, 11 equations, 11 figures, 8 tables.

Figures (11)

  • Figure 1: Frequency-energy evolution during denoising. (a) Visualization of the denoising process. (b) Evolution of frequency-band energies. (c) Frequency-energy distribution across timesteps.
  • Figure 2: The classical parameter-efficient fine-tuning methods.
  • Figure 3: Overview of the FeRA framework. The Frequency-Energy Indicator (FEI) extracted by DoG operators guides a Soft Frequency Router to adaptively blend multiple LoRA experts. A Frequency-Energy Consistency Loss (FECL) further regularizes the spectral alignment between correction and residual during fine-tuning.
  • Figure 4: Comparison of the generated images between different PEFT methods.
  • Figure 5: DreamBooth results across PEFT methods. FeRA delivers more consistent identity and cleaner compositions.
  • ...and 6 more figures
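
The Figure 3 caption states that the Frequency-Energy Indicator is extracted by DoG (difference-of-Gaussians) operators. As a rough illustration of what a DoG band decomposition measures, the following sketch works on a 1D signal instead of a 2D latent. The blur scales in `sigmas`, the clamped edge handling, and the mean-squared energy definition are all assumptions and are not taken from the paper.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(signal, sigma):
    """Gaussian blur with edge values clamped (replicate padding)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def band_energies(signal, sigmas=(1.0, 2.0, 4.0)):
    """Mean squared energy per DoG band, ordered from highest to lowest frequency.

    Each band is the difference between two successive blur levels; the last
    entry is the energy of the most blurred (low-pass) residual.
    """
    levels = [list(signal)] + [blur(signal, s) for s in sigmas]
    bands = [[a - b for a, b in zip(levels[i], levels[i + 1])] for i in range(len(sigmas))]
    bands.append(levels[-1])
    return [sum(v * v for v in band) / len(band) for band in bands]

# A slow sine wave versus a rapidly alternating signal.
smooth = [math.sin(2 * math.pi * i / 64) for i in range(64)]
noisy = [1.0 if i % 2 == 0 else -1.0 for i in range(64)]
e_smooth = band_energies(smooth)
e_noisy = band_energies(noisy)
```

In this toy setting the slow sine concentrates its energy in the low-pass residual, while the alternating signal concentrates it in the finest band. A vector of this kind is the sort of band-wise summary that a router could consume.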