ComboStoc: Combinatorial Stochasticity for Diffusion Generative Models
Rui Xu, Jiepeng Wang, Hao Pan, Yang Liu, Xin Tong, Shiqing Xin, Changhe Tu, Taku Komura, Wenping Wang
TL;DR
ComboStoc tackles the under-explored issue of combinatorial complexity in diffusion models by introducing asynchronous diffusion schedules that fully sample the space spanned by dimensions and attributes. This simple modification broadens network coverage, accelerates training, and enables new test-time capabilities such as partial preservation and graded conditioning across patches, parts, and features. Experiments on images (ImageNet) and structured 3D shapes (PartNet) show systematic improvements in FID, FPD, MMD, and COV, and the method supports diverse generation tasks, including shape completion and part assembly. The approach offers a practical, broadly applicable principle for exploiting combinatorial structure in diffusion models, with implications for controllable generation across modalities.
Abstract
In this paper, we study an under-explored but important factor of diffusion generative models, namely combinatorial complexity. Data samples are generally high-dimensional, and for various structured generation tasks, additional attributes are combined and associated with the data samples. We show that the space spanned by the combination of dimensions and attributes is insufficiently sampled by the existing training schemes of diffusion generative models, causing degraded test-time performance. We present a simple fix to this problem by constructing stochastic processes that fully exploit the combinatorial structures, hence the name ComboStoc. Using this simple strategy, we show that network training is significantly accelerated across diverse data modalities, including images and 3D structured shapes. Moreover, ComboStoc enables a new way of test-time generation that uses asynchronous time steps for different dimensions and attributes, thus allowing for varying degrees of control over them.
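To make the idea of asynchronous time steps concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a linear interpolant between noise and data, with t = 0 meaning pure noise and t = 1 meaning clean data, and uniformly sampled time steps; the function names `sample_timesteps` and `interpolate` are hypothetical. The synchronous variant shares one time step across all dimensions of a sample, while the asynchronous variant draws an independent time step per dimension.

```python
import numpy as np


def sample_timesteps(shape, rng, asynchronous=True):
    """Return time steps broadcastable against a batch of the given shape.

    Synchronous: one t per sample, shared by all of its dimensions.
    Asynchronous: an independent t per dimension (assumed uniform on [0, 1]).
    """
    batch = shape[0]
    if asynchronous:
        return rng.uniform(0.0, 1.0, size=shape)
    t = rng.uniform(0.0, 1.0, size=(batch,))
    # Reshape to (batch, 1, ..., 1) so t broadcasts over the remaining axes.
    return t.reshape((batch,) + (1,) * (len(shape) - 1))


def interpolate(x_data, noise, t):
    """Linear interpolant (assumed convention): t=0 is noise, t=1 is data."""
    return t * x_data + (1.0 - t) * noise


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))      # toy batch: 4 samples, 8 dimensions each
eps = rng.normal(size=x.shape)   # Gaussian noise of the same shape

t_sync = sample_timesteps(x.shape, rng, asynchronous=False)   # shape (4, 1)
t_async = sample_timesteps(x.shape, rng, asynchronous=True)   # shape (4, 8)

x_sync = interpolate(x, eps, t_sync)
x_async = interpolate(x, eps, t_async)

# Partial preservation: hold the first four dimensions at t=1 (clean data)
# and leave the remaining four at t=0 (pure noise, still to be generated).
t_keep = np.zeros_like(x)
t_keep[:, :4] = 1.0
x_keep = interpolate(x, eps, t_keep)
```

In this sketch, the preserved dimensions of `x_keep` equal the data exactly and the remaining ones equal the noise, which illustrates how per-dimension time steps could express different degrees of control over different parts of a sample.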
