Shaping Inductive Bias in Diffusion Models through Frequency-Based Noise Control
Thomas Jiralerspong, Berton Earnshaw, Jason Hartford, Yoshua Bengio, Luca Scimeca
TL;DR
This work addresses how to inject targeted inductive biases into diffusion probabilistic models by shaping the forward noising process in the frequency domain. It introduces frequency diffusion, which uses a frequency weighting $w(\mathbf{f})$ to modulate Gaussian noise during diffusion, thereby steering the model to learn specific spectral components of the data. Empirically, the approach improves sample quality (as measured by FID/KID) on several datasets and enables selective learning by omitting information in chosen frequency bands, with the two-band mixture showing robust gains across diverse visual domains. The results suggest that frequency-based noise control is a versatile tool for aligning diffusion models with the spectral structure of target data, with potential extensions to dynamic schedules and other inductive-bias mechanisms.
Abstract
Diffusion Probabilistic Models (DPMs) are powerful generative models that have achieved unparalleled success in a number of generative tasks. In this work, we aim to build inductive biases into the training and sampling of diffusion models to better match the target data distribution. For topologically structured data, we devise a frequency-based noising operator to deliberately manipulate and set these inductive biases. We first show that appropriate manipulations of the forward noising process can lead DPMs to focus on learning particular aspects of the distribution. We then show that different datasets require different inductive biases, and that appropriate frequency-based noise control improves generative performance over standard diffusion. Finally, we demonstrate that information at particular frequencies can be ignored during learning. We show this in an image corruption and recovery task, where we train a DPM to recover the original target distribution after severe noise corruption.
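To make the idea of a frequency-based noising operator concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the weighting $w(\mathbf{f})$ is applied by multiplying the 2D Fourier transform of white Gaussian noise and transforming back, that the two-band weighting is a hard radial split with separate gains, and that the shaped noise enters a standard DDPM-style forward step. All function names and parameters (`radial_frequency`, `two_band_weight`, `frequency_noise`, `forward_noise`, `cutoff`, `low_gain`, `high_gain`, `alpha_bar`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def radial_frequency(height, width):
    """Radial frequency magnitude (cycles per pixel) for each 2D FFT bin."""
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    return np.sqrt(fx ** 2 + fy ** 2)


def two_band_weight(height, width, cutoff=0.25, low_gain=1.0, high_gain=0.5):
    """Assumed two-band weighting w(f): one gain below the cutoff, another above."""
    r = radial_frequency(height, width)
    return np.where(r <= cutoff, low_gain, high_gain)


def frequency_noise(shape, weight, rng):
    """Draw white Gaussian noise and reweight its spectrum by `weight`."""
    eps = rng.standard_normal(shape)
    spectrum = np.fft.fft2(eps)
    # The weight is real and radially symmetric, so the result is real up to
    # floating-point error; .real discards that residual imaginary part.
    return np.fft.ifft2(spectrum * weight).real


def forward_noise(x0, alpha_bar, weight, rng):
    """Assumed DDPM-style forward step using frequency-shaped noise."""
    eps_f = frequency_noise(x0.shape, weight, rng)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps_f
    return x_t, eps_f


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.standard_normal((32, 32))  # stand-in for a single-channel image
    w = two_band_weight(32, 32, cutoff=0.25, low_gain=1.0, high_gain=0.0)
    x_t, eps_f = forward_noise(x0, alpha_bar=0.5, weight=w, rng=rng)
    print(x_t.shape, eps_f.shape)
```

In this sketch, setting `high_gain=0.0` removes all noise energy above the cutoff, so the forward process only corrupts low-frequency content; setting both gains to 1.0 recovers ordinary white Gaussian noise. Note that the shaped noise is not renormalized here, so its variance depends on the chosen gains; whether and how to renormalize is a design choice this sketch leaves open.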
