Shaping Inductive Bias in Diffusion Models through Frequency-Based Noise Control

Thomas Jiralerspong, Berton Earnshaw, Jason Hartford, Yoshua Bengio, Luca Scimeca

TL;DR

This work addresses how to inject targeted inductive biases into diffusion probabilistic models by shaping the forward noising process in the frequency domain. It introduces frequency diffusion, which uses a frequency weighting $w(\mathbf{f})$ to modulate the Gaussian noise added during diffusion, thereby steering the model toward learning specific spectral components of the data. Empirically, the approach improves sample quality (as measured by FID/KID) on several datasets and enables selective learning by omitting information in chosen frequency bands, with the two-band mixture showing robust gains across diverse visual domains. The results suggest that frequency-based noise control is a versatile tool for aligning diffusion models with the spectral structure of the target data, with potential extensions to dynamic schedules and other inductive-bias mechanisms.

Abstract

Diffusion Probabilistic Models (DPMs) are powerful generative models that have achieved unparalleled success in a number of generative tasks. In this work, we aim to build inductive biases into the training and sampling of diffusion models to better accommodate the target data distribution. For topologically structured data, we devise a frequency-based noising operator to purposefully manipulate, and set, these inductive biases. We first show that appropriate manipulations of the forward noising process can lead DPMs to focus on learning particular aspects of the distribution. We then show that different datasets necessitate different inductive biases, and that appropriate frequency-based noise control improves generative performance over standard diffusion. Finally, we demonstrate the possibility of ignoring information at particular frequencies while learning. We show this in an image corruption and recovery task, where we train a DPM to recover the original target distribution after severe noise corruption.
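
To make the idea of a frequency-based noising operator concrete, the following is a minimal NumPy sketch, not the authors' implementation. It draws white Gaussian noise, rescales its 2-D Fourier spectrum by a real weight $w(\mathbf{f})$, and transforms back. The function names, the radial definition of frequency, the single `cutoff` separating the two bands, and the unit-power normalization of the weight are all assumptions made for illustration; the summary above does not specify them.

```python
import numpy as np


def radial_frequency(height, width):
    """Normalized radial frequency in [0, 1] for every 2-D FFT bin."""
    fy = np.fft.fftfreq(height)[:, None]  # cycles per pixel along rows
    fx = np.fft.fftfreq(width)[None, :]   # cycles per pixel along columns
    radius = np.sqrt(fx**2 + fy**2)
    return radius / radius.max()


def two_band_weight(radius, cutoff=0.5, gamma_low=0.5, gamma_high=0.5):
    """Hypothetical two-band w(f): gamma_low below `cutoff`, gamma_high elsewhere."""
    return np.where(radius < cutoff, gamma_low, gamma_high)


def frequency_noise(shape, weight, rng):
    """Sample Gaussian noise whose spectrum is reweighted by `weight`.

    `weight` must match the last two axes of `shape`. It is rescaled to unit
    mean power so the per-pixel variance stays close to 1 (an assumption of
    this sketch, not something stated in the source).
    """
    weight = weight / np.sqrt(np.mean(weight**2))
    eps = rng.standard_normal(shape)            # white Gaussian noise
    spectrum = np.fft.fft2(eps)                 # FFT over the last two axes
    # The weight is real and symmetric in frequency, so the inverse transform
    # is real up to floating-point error; np.real drops that residue.
    return np.real(np.fft.ifft2(weight * spectrum))


rng = np.random.default_rng(0)
radius = radial_frequency(32, 32)
low_heavy = two_band_weight(radius, gamma_low=0.9, gamma_high=0.1)
noise = frequency_noise((32, 32), low_heavy, rng)
```

In this sketch, equal weights in both bands reduce to ordinary white Gaussian noise, while unequal weights concentrate the noise power in one band. Such a sample could then take the place of the usual Gaussian noise term in a forward diffusion step.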

Paper Structure

This paper contains 18 sections, 14 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Frequency diffusion under a generalized framework.
  • Figure 2: Power spectra and image visuals of the forward process in standard diffusion, compared with the high- and low-frequency noise settings of a two-band mixture noise parametrization.
  • Figure 3: FID of diffusion samplers trained with various combinations of frequency noise. The setting $\gamma_l=0.5$ yields standard diffusion training.
  • Figure 4: Samples from the original data distribution, the degraded data distribution, a standard diffusion sampler trained on the degraded data distribution, and a frequency diffusion sampler trained on the degraded data distribution. We generate noise for data corruption in the frequency range $[a_c=0.5,\ b_c=0.6)$.