Generative diffusion model with inverse renormalization group flows
Kanta Masuki, Yuto Ashida
TL;DR
This work integrates exact renormalization group (RG) flows into generative diffusion, yielding an RG-based diffusion model (RGDM) that generates data in a coarse-to-fine sequence by reversing RG coarse-graining. The forward process applies scale-dependent colored noise, set by a regulator, that progressively erases information from fine-scale details to coarse-grained structures; the backward process reconstructs high-resolution structure starting from a Gaussian fixed-point distribution p_GS ∝ exp(-½∫(∇φ)^2). The approach reduces the need for data-dependent tuning of noise-schedule hyperparameters and, in protein structure prediction and image generation, improves sample quality and/or accelerates sampling by an order of magnitude relative to conventional diffusion models. The theoretical development, including the Polchinski RG equation and the convex-diffusion flow, connects RG theory to practical diffusion modeling.
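To make the Gaussian fixed-point distribution above concrete, here is a minimal sketch of drawing one sample from p_GS ∝ exp(-½∫(∇φ)^2) on a periodic two-dimensional grid. In Fourier space this density is an independent Gaussian per mode with variance proportional to 1/|k|^2, so a sample can be obtained by filtering white noise. The function name, the grid size, the continuum form of |k|^2 (rather than a lattice Laplacian), and the removal of the zero mode are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def sample_gaussian_fixed_point(n=64, seed=0):
    """Draw one n x n field whose power spectrum is proportional to 1/|k|^2.

    Illustrative only: continuum |k|^2 on a periodic grid, zero mode dropped.
    """
    rng = np.random.default_rng(seed)
    k = 2.0 * np.pi * np.fft.fftfreq(n)           # wavenumbers along one axis
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx**2 + ky**2
    k2[0, 0] = np.inf                             # 1/|k|^2 is undefined at k = 0; remove that mode
    white = rng.standard_normal((n, n))           # real-valued white noise
    phi_k = np.fft.fft2(white) / np.sqrt(k2)      # colour the noise: amplitude ~ 1/|k|
    return np.fft.ifft2(phi_k).real               # filter is even in k, so the field is real

phi = sample_gaussian_fixed_point()
```

The resulting field is dominated by long-wavelength fluctuations, which is the sense in which such a starting point carries only coarse structure.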
Abstract
Diffusion models are a class of generative models that produce data by denoising a sample corrupted by white noise. Despite their success in computer vision, audio synthesis, and point cloud generation, diffusion models have so far overlooked the inherent multiscale structure of data, and their generation process is slow because it requires many iteration steps. In physics, the renormalization group offers a fundamental framework for linking different scales and obtaining an accurate coarse-grained model. Here we introduce a renormalization group-based diffusion model that leverages the multiscale nature of data distributions to realize high-quality data generation. In the spirit of renormalization group procedures, we define a flow equation that progressively erases data information from fine-scale details to coarse-grained structures. By reversing the renormalization group flows, our model generates high-quality samples in a coarse-to-fine manner. We validate the versatility of the model through applications to protein structure prediction and image generation. Our model consistently outperforms conventional diffusion models across standard evaluation metrics, enhancing sample quality and/or accelerating sampling speed by an order of magnitude. The proposed method alleviates the need for data-dependent tuning of hyperparameters in generative diffusion models, showing promise for systematically increasing sample efficiency based on the concept of the renormalization group.
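The idea of a flow that erases fine-scale information before coarse-grained structure can be sketched with a toy forward process in Fourier space. This is not the flow equation or regulator used in the paper, which this section does not specify. It assumes, purely for illustration, that each Fourier mode follows an Ornstein-Uhlenbeck process with relaxation rate |k|^2 and stationary variance 1/|k|^2, so that high-wavenumber modes are replaced by noise first. The function name `forward_coarse_grain` and the time parameter `t` are hypothetical.

```python
import numpy as np

def forward_coarse_grain(x0, t, seed=0):
    """Toy scale-dependent noising of a square 2D array x0 up to time t.

    Assumed schedule (not from the paper): mode k keeps a fraction
    exp(-|k|^2 t) of its signal and gains noise with variance
    (1 - exp(-2 |k|^2 t)) / |k|^2. The zero mode is left untouched.
    """
    rng = np.random.default_rng(seed)
    n = x0.shape[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx**2 + ky**2
    decay = np.exp(-k2 * t)                            # retained signal per mode; fine scales decay first
    safe_k2 = np.where(k2 > 0, k2, 1.0)                # avoid 0/0 at k = 0, where the numerator is 0
    noise_std = np.sqrt((1.0 - decay**2) / safe_k2)    # coloured noise amplitude per mode
    white_k = np.fft.fft2(rng.standard_normal(x0.shape))
    xt_k = decay * np.fft.fft2(x0) + noise_std * white_k
    return np.fft.ifft2(xt_k).real, decay

# Usage: noise a random 32 x 32 "image" and inspect how much signal each scale keeps.
x0 = np.random.default_rng(1).standard_normal((32, 32))
xt, decay = forward_coarse_grain(x0, t=0.5)
```

In this toy schedule `decay` is close to 1 for small |k| and close to 0 for large |k| at intermediate `t`, so a learned reverse process would have to restore coarse structure first and fine detail last, which is the coarse-to-fine ordering the abstract describes.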
