General and Efficient Steering of Unconditional Diffusion
Qingsong Wang, Mikhail Belkin, Yusu Wang
TL;DR
This work targets controllable generation with unconditional diffusion models without inference-time gradients. It introduces Noise-Aligned RFM Steering (NA-RFM), which learns class directions offline via PCA statistics and Recursive Feature Machines, then applies two-stage, gradient-free guidance: noise alignment at high noise levels for coarse structure, followed by RFM-based activation steering at lower noise levels for fine-grained control. The method reports substantial accuracy and image-quality improvements over gradient-based baselines on CIFAR-10, ImageNet, CelebA-HQ, and Birds-525, while achieving major inference speedups and requiring zero classifier evaluations at inference. The results point to a scalable, generalizable approach to post-hoc controllable diffusion that exploits the transferability of activation-space directions across timesteps and samples, reducing computational overhead without sacrificing fidelity.
Abstract
Guiding unconditional diffusion models typically requires either retraining with conditional inputs or per-step gradient computations (e.g., classifier-based guidance), both of which incur substantial computational overhead. We present a general recipe for efficiently steering unconditional diffusion without gradient guidance during inference, enabling fast controllable generation. Our approach builds on two observations about diffusion model structure. (i) Noise alignment: even in early, highly corrupted stages, coarse semantic steering is possible using a lightweight, offline-computed guidance signal, avoiding any per-step or per-sample gradients. (ii) Transferable concept vectors: a concept direction in activation space, once learned, transfers across both timesteps and samples; the same fixed steering vector learned near low noise levels remains effective when injected at intermediate noise levels for every generation trajectory, providing refined conditional control efficiently. Such concept directions can be identified efficiently and reliably with the Recursive Feature Machine (RFM), a lightweight, backpropagation-free feature learning method. Experiments on CIFAR-10, ImageNet, and CelebA demonstrate improved accuracy and quality over gradient-based guidance, along with significant inference speedups.
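The two-stage, gradient-free control flow described above can be illustrated with a toy sketch. Everything in it is a hypothetical stand-in, not the paper's implementation: `toy_denoiser`, the update rule, the switch point `t_switch`, and the strengths `alpha` and `beta` are assumptions chosen only to show the structure (an offline-computed shift at high noise, then a fixed activation-space vector at lower noise, with no gradients taken during sampling).

```python
import numpy as np

def toy_denoiser(x, t, steer=None):
    """Hypothetical stand-in for a diffusion network with one hidden layer.
    `steer` is a fixed vector added to the hidden activation (no gradients)."""
    h = np.tanh(x)                      # toy hidden activation
    if steer is not None:
        h = h + steer                   # activation steering with a fixed vector
    return 0.9 * h                      # toy estimate of the clean sample

def steered_sample(x_T, timesteps, class_dir, concept_vec,
                   t_switch, alpha=0.1, beta=0.5):
    """Toy two-stage sampler. All parameters are illustrative assumptions."""
    x = x_T.copy()
    for t in timesteps:                 # timesteps run from high to low noise
        if t >= t_switch:
            # Stage 1 (high noise): shift the noisy sample along an
            # offline-computed class direction for coarse control.
            x = x + alpha * class_dir
            x0_hat = toy_denoiser(x, t)
        else:
            # Stage 2 (lower noise): inject the same fixed concept vector
            # into the activations at every step and for every sample.
            x0_hat = toy_denoiser(x, t, steer=beta * concept_vec)
        x = x + (x0_hat - x) / (t + 1)  # toy update toward the clean estimate
    return x

rng = np.random.default_rng(0)
x_T = rng.standard_normal(4)            # initial noise
direction = np.array([1.0, 0.0, 0.0, 0.0])  # assumed precomputed offline
steps = range(9, -1, -1)

plain = steered_sample(x_T, steps, direction, direction, t_switch=5,
                       alpha=0.0, beta=0.0)
steered = steered_sample(x_T, steps, direction, direction, t_switch=5)
```

In this toy setup the steered sample ends with a larger component along `direction` than the unsteered one, while the other coordinates are unchanged. In the actual method, the directions would come from the offline PCA statistics and the RFM rather than being set by hand.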
