Fractal Generative Models
Tianhong Li, Qinyi Sun, Lijie Fan, Kaiming He
TL;DR
The paper introduces fractal generative models, a modular framework that recursively composes atomic generative modules into self-similar architectures for modeling high-dimensional, non-sequential data. It instantiates the idea with autoregressive modules, yielding the FractalAR and FractalMAR variants, and evaluates them on pixel-by-pixel ImageNet generation, reporting competitive likelihoods on 64×64 images and high-quality 256×256 samples at a tractable compute cost. The divide-and-conquer hierarchy reduces computation relative to attention over the full pixel sequence and to tokenization-based methods, while enabling interpretable, controllable generation. These results suggest that fractal modularization is a promising paradigm for generative modeling of data with intrinsic multi-scale structure.
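To make the computational argument concrete, the following is a toy cost model rather than the paper's own accounting. It assumes attention cost is proportional to the number of pairwise interactions, that each level of the hierarchy splits its input into `k` parts, and that every module call attends over only `k` elements. The function names and the values `k = 16`, `levels = 3` are illustrative choices, not figures from the paper.

```python
def full_attention_pairs(n):
    """Pairwise interactions when one model attends over all n elements."""
    return n * n


def fractal_attention_pairs(k, levels):
    """Pairwise interactions in a k-ary hierarchy (toy assumption).

    Level i contains k**i module calls, and each call attends over k elements,
    so it contributes k**i * k**2 interactions.
    """
    return sum((k ** i) * (k * k) for i in range(levels))


k, levels = 16, 3            # illustrative values only
n = k ** levels              # 4096 elements in total
full = full_attention_pairs(n)
fractal = fractal_attention_pairs(k, levels)
print(full, fractal, full // fractal)  # 16777216 69888 240
```

Under these assumptions the hierarchy needs roughly 240 times fewer pairwise interactions for the same number of elements, and the gap widens as the sequence grows. Real costs also depend on module width and depth at each level, which this sketch ignores.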
Abstract
Modularization is a cornerstone of computer science, abstracting complex functions into atomic building blocks. In this paper, we introduce a new level of modularization by abstracting generative models into atomic generative modules. Analogous to fractals in mathematics, our method constructs a new type of generative model by recursively invoking atomic generative modules, resulting in self-similar fractal architectures that we call fractal generative models. As a running example, we instantiate our fractal framework using autoregressive models as the atomic generative modules and examine it on the challenging task of pixel-by-pixel image generation, demonstrating strong performance in both likelihood estimation and generation quality. We hope this work could open a new paradigm in generative modeling and provide a fertile ground for future research. Code is available at https://github.com/LTH14/fractalgen.
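The recursive construction can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository above for that): `toy_atomic_generator` is a hypothetical stand-in for a learned autoregressive module, and contexts are plain floats instead of feature vectors. What the sketch does show is the control flow: one module type is invoked recursively, each call emits `k` child contexts in sequence, and the leaves of the recursion are the generated elements.

```python
import random


def toy_atomic_generator(context, k, rng):
    """Hypothetical stand-in for an atomic generative module.

    Emits k child contexts one at a time. Each child depends on the parent
    context and on the previously emitted sibling, a crude autoregressive chain.
    """
    children = []
    prev = context
    for _ in range(k):
        child = 0.5 * context + 0.5 * prev + rng.gauss(0.0, 0.1)
        children.append(child)
        prev = child
    return children


def fractal_generate(context, depth, k, rng, calls):
    """Recursively invoke the same module type; leaves are the output elements."""
    if depth == 0:
        return [context]
    calls[0] += 1  # count module invocations
    out = []
    for child in toy_atomic_generator(context, k, rng):
        out.extend(fractal_generate(child, depth - 1, k, rng, calls))
    return out


rng = random.Random(0)
calls = [0]
sample = fractal_generate(context=0.0, depth=3, k=4, rng=rng, calls=calls)
print(len(sample), calls[0])  # 64 21
```

With `k = 4` and three levels, 21 module calls (1 + 4 + 16) produce 64 elements, and no single call handles more than four outputs. This is the self-similar, divide-and-conquer structure the abstract describes, reduced to its bare recursion.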
