Dance recalibration for dance coherency with recurrent convolution block
Seungho Eum, Ihjoon Cho, Junghyeon Kim
TL;DR
The paper tackles long-form, music-conditioned dance generation by identifying coherence issues in Lodge's coarse-to-fine diffusion framework. It introduces Dance Recalibration (DR) and a Pooling Block to propagate sequential context into subsequent frames, yielding R-Lodge, which achieves state-of-the-art dance coherence on the FineDance dataset with improved Beat Alignment Score while maintaining practical runtimes. The approach demonstrates that incorporating lightweight recurrent-like processing into the diffusion-based pipeline can significantly enhance inter-frame continuity without prohibitive computational costs. This work advances long-sequence, music-driven dance synthesis and suggests directions for broader validation and diversity improvements.
Abstract
With recent advances in generative models such as GANs, diffusion models, and VAEs, dance generation has made significant progress and attracted considerable interest. In this study, we propose R-Lodge, an enhanced version of Lodge. R-Lodge adds a recurrent sequential representation learning method, named Dance Recalibration, to the original coarse-to-fine long dance generation model. Dance Recalibration uses $N$ Dance Recalibration Blocks to address the lack of consistency in the coarse dance representation of the Lodge model. With this method, each generated dance motion incorporates some information from the preceding dance motions. We evaluate R-Lodge on the FineDance dataset, and the results show that R-Lodge improves the consistency of the generated dance motions as a whole.
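To make the idea of passing information forward between motions concrete, the following is a minimal sketch, not the authors' implementation. It assumes each coarse dance motion is a plain feature vector and replaces the Dance Recalibration Block with a simple weighted blend of the current motion and the previous output. The names `recalibrate_once` and `recalibrate`, the blend weight `alpha`, and the blending rule itself are hypothetical and chosen only for illustration; `n_blocks` plays the role of $N$.

```python
from typing import List

Vector = List[float]


def recalibrate_once(segments: List[Vector], alpha: float = 0.1) -> List[Vector]:
    """One hypothetical recalibration pass over a sequence of motion vectors.

    Each output is a blend of the current input and the previous OUTPUT,
    so information accumulates along the sequence (a recurrence).
    This blend is an assumption standing in for the paper's learned block.
    """
    if not segments:
        return []
    out = [list(segments[0])]  # the first motion has no predecessor
    for current in segments[1:]:
        previous = out[-1]
        out.append([(1.0 - alpha) * c + alpha * p
                    for c, p in zip(current, previous)])
    return out


def recalibrate(segments: List[Vector], n_blocks: int = 2,
                alpha: float = 0.1) -> List[Vector]:
    """Apply the hypothetical pass n_blocks times in sequence."""
    for _ in range(n_blocks):
        segments = recalibrate_once(segments, alpha)
    return segments


if __name__ == "__main__":
    coarse = [[1.0, 0.0], [0.0, 1.0]]
    print(recalibrate(coarse, n_blocks=2, alpha=0.5))
```

In this toy version, a larger `alpha` or more passes pulls each motion closer to its predecessor, which illustrates the trade-off between consistency and variety that any such recurrent mixing has to balance.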
