Boomerang: Local sampling on image manifolds using diffusion models
Lorenzo Luzi, Paul M Mayer, Josue Casco-Rodriguez, Ali Siahkoohi, Richard G. Baraniuk
TL;DR
Boomerang performs local sampling on image manifolds: it forward-diffuses an input image $\mathbf{x}_0$ for $t_{Boom}$ steps and then runs the reverse diffusion from step $t_{Boom}$ back to step 0, landing on a nearby manifold point $\mathbf{x}_0'$. The approach works with any pretrained diffusion model and requires no retraining or architecture changes; $t_{Boom}$ is the sole control for locality ($t_{Boom}=T$, the full diffusion length, yields global samples). The authors demonstrate three applications: privacy-preserving anonymization, data augmentation that generalizes better than state-of-the-art synthetic augmentation, and perceptual resolution enhancement (PRE), which emphasizes perceptual quality. Because it relies only on readily available pretrained diffusion models, Boomerang enables targeted manipulation of image content and quality without retraining.
Abstract
The inference stage of diffusion models can be seen as running a reverse-time diffusion stochastic differential equation, where samples from a Gaussian latent distribution are transformed into samples from a target distribution that usually reside on a low-dimensional manifold, e.g., an image manifold. The intermediate values between the initial latent space and the image manifold can be interpreted as noisy images, with the amount of noise determined by the forward diffusion process noise schedule. We utilize this interpretation to present Boomerang, an approach for local sampling of image manifolds. As implied by its name, Boomerang local sampling involves adding noise to an input image, moving it closer to the latent space, and then mapping it back to the image manifold through a partial reverse diffusion process. Thus, Boomerang generates images on the manifold that are "similar," but nonidentical, to the original input image. We can control the proximity of the generated images to the original by adjusting the amount of noise added. Furthermore, because the reverse diffusion process in Boomerang is stochastic, repeated runs on the same input yield different images, allowing us to obtain local samples from the manifold without encountering any duplicates. Boomerang offers the flexibility to work seamlessly with any pretrained diffusion model, such as Stable Diffusion, without necessitating any adjustments to the reverse diffusion process. We present three applications for Boomerang. First, we provide a framework for constructing privacy-preserving datasets having controllable degrees of anonymity. Second, we show that using Boomerang for data augmentation increases generalization performance and outperforms state-of-the-art synthetic data augmentation. Lastly, we introduce a perceptual image enhancement framework, which enables resolution enhancement.
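The procedure described above (add noise up to step $t_{Boom}$, then reverse-diffuse back to step 0) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a discrete DDPM-style linear noise schedule and ancestral sampler, and it replaces the pretrained network with a hypothetical stand-in, `toy_eps_model`, which is the exact noise predictor only when the data are standard Gaussian (not an image manifold). The names `make_schedule`, `boomerang`, and `toy_eps_model` are illustrative.

```python
import numpy as np


def make_schedule(T=1000, beta_min=1e-4, beta_max=0.02):
    """Linear DDPM-style noise schedule (an assumption, not taken from the paper)."""
    betas = np.linspace(beta_min, beta_max, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_eps_model(alpha_bars):
    """Hypothetical stand-in for a pretrained noise predictor.

    For data x_0 ~ N(0, I), E[eps | x_t] = sqrt(1 - alpha_bar_t) * x_t.
    A real use would plug in a trained diffusion model here instead.
    """
    def eps_model(x_t, t):
        return np.sqrt(1.0 - alpha_bars[t - 1]) * x_t
    return eps_model


def boomerang(x0, t_boom, eps_model, schedule, rng):
    """Forward-diffuse x0 to step t_boom, then reverse-diffuse back to step 0."""
    betas, alphas, alpha_bars = schedule
    if t_boom == 0:
        return x0.copy()  # no noise added, so the input is returned unchanged

    # Forward process: jump directly to step t_boom in closed form.
    abar = alpha_bars[t_boom - 1]
    x = np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * rng.standard_normal(x0.shape)

    # Reverse process: ancestral sampling steps t_boom, t_boom - 1, ..., 1.
    for t in range(t_boom, 0, -1):
        i = t - 1  # steps are 1-indexed, arrays are 0-indexed
        eps_hat = eps_model(x, t)
        mean = (x - betas[i] / np.sqrt(1.0 - alpha_bars[i]) * eps_hat) / np.sqrt(alphas[i])
        # No fresh noise is injected on the final step.
        noise = rng.standard_normal(x.shape) if t > 1 else 0.0
        x = mean + np.sqrt(betas[i]) * noise
    return x


# Usage: a smaller t_boom should keep the sample closer to the input.
rng = np.random.default_rng(0)
schedule = make_schedule()
eps_model = toy_eps_model(schedule[2])
x0 = rng.standard_normal(2000)
for t_boom in (50, 500, 1000):
    x0_prime = boomerang(x0, t_boom, eps_model, schedule, rng)
    print(t_boom, float(np.mean((x0_prime - x0) ** 2)))
```

In this toy setting the mean squared distance between the input and its Boomerang sample grows with `t_boom`, which mirrors the role of the noise level as the locality control; with a real pretrained model, `eps_model` would be the network's noise prediction and `x0` an image (or its latent code).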
