ASGDiffusion: Parallel High-Resolution Generation with Asynchronous Structure Guidance
Yuming Li, Peidong Jia, Daiwei Hong, Yueru Jia, Qi She, Rui Zhao, Ming Lu, Shanghang Zhang
TL;DR
ASGDiffusion tackles training-free high-resolution image generation, addressing pattern repetition through structure-guided denoising and a cross-attention mask. It introduces an asynchronous structure guidance strategy that enables multi-GPU parallelism, accelerating HR image generation while maintaining semantic coherence. The method integrates with multiple Stable Diffusion variants and delivers strong qualitative and quantitative performance, particularly at resolutions such as 2048×2048 and 3072×3072. Although limitations remain at ultra-high resolutions, ASGDiffusion provides a practical, scalable approach to fast, high-quality HR diffusion without additional training.
Abstract
Training-free high-resolution (HR) image generation has garnered significant attention due to the high cost of training large diffusion models. Most existing methods first reconstruct the overall structure and then refine the local details. Despite their advances, they still suffer from repetitive patterns in HR image generation. Moreover, HR generation with diffusion models incurs significant computational cost, so parallel generation is essential for interactive applications. To address these limitations, we introduce ASGDiffusion, a novel method for parallel HR generation with Asynchronous Structure Guidance (ASG) using pre-trained diffusion models. To address pattern repetition, ASGDiffusion uses the low-resolution (LR) noise, weighted by the attention mask, as structure guidance for the denoising step, which ensures semantic consistency and significantly alleviates the repetition problem. To enable parallel generation, we further propose a parallelism strategy that calculates the patch noises and the structure guidance asynchronously. By leveraging multi-GPU parallel acceleration, we significantly speed up generation and reduce memory usage per GPU. Extensive experiments demonstrate that our method effectively and efficiently addresses common issues such as pattern repetition and achieves state-of-the-art HR generation.
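To make the idea of mask-weighted structure guidance concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not give the exact formula, so the convex blend, the nearest-neighbour upsampling, and every name here (`upsample_nearest`, `guided_patch_noise`, `weight`, `scale`) are assumptions chosen for illustration only.

```python
import numpy as np


def upsample_nearest(x, scale):
    """Nearest-neighbour upsampling of a (C, H, W) array by an integer factor."""
    return x.repeat(scale, axis=1).repeat(scale, axis=2)


def guided_patch_noise(patch_noise, lr_noise, attn_mask, top, left, scale, weight=0.5):
    """Blend one HR patch's noise with the matching region of upsampled LR noise.

    patch_noise: (C, h, w) noise predicted for an HR patch.
    lr_noise:    (C, H, W) noise predicted for the whole LR image.
    attn_mask:   (H, W) values in [0, 1]; larger means stronger guidance.
    top, left:   patch offset in HR coordinates.
    weight:      hypothetical global guidance strength (assumption).
    """
    _, h, w = patch_noise.shape
    # Bring the LR noise and the mask to HR size, then crop the patch region.
    lr_up = upsample_nearest(lr_noise, scale)[:, top:top + h, left:left + w]
    mask_up = upsample_nearest(attn_mask[None], scale)[0, top:top + h, left:left + w]
    # Assumed blend: guidance pulls the patch toward the LR structure where the mask is high.
    alpha = weight * mask_up
    return (1.0 - alpha) * patch_noise + alpha * lr_up


# Toy usage: a 2x2 LR noise map guiding one 2x2 patch of a 4x4 HR latent.
lr_noise = np.ones((1, 2, 2))
attn_mask = np.array([[1.0, 0.0], [0.0, 1.0]])
patch_noise = np.zeros((1, 2, 2))
guided = guided_patch_noise(patch_noise, lr_noise, attn_mask, top=0, left=0, scale=2)
```

Under one reading of "asynchronously", the `lr_noise` passed in could be the most recent guidance produced by a separate worker rather than one computed in lockstep with the patch; that interpretation is also an assumption.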
