Lodge++: High-quality and Long Dance Generation with Vivid Choreography Patterns

Ronghui Li; Hongwen Zhang; Yachao Zhang; Yuxiang Zhang; Youliang Zhang; Jie Guo; Yan Zhang; Xiu Li; Yebin Liu

TL;DR

Lodge++ generates ultra-long, high-quality, music-driven 3D dances by decoupling global choreography from local motion. A Global Choreography Network, built on a VQ-VAE and a GPT, learns rich global patterns and derives dance primitives from them. A Primitive-based Diffusion Model then denoises in parallel, guided by those primitives, to produce long sequences. Three additions improve physical realism and genre consistency: a Foot Refine Block, a Multi-Genre Discriminator, and SDF-based Penetration Guidance. On the FineDance dataset, the method achieves superior beat alignment and lower self-penetration. Ablation studies and user surveys support the benefit of coupling global primitives with diffusion and of the physically informed refinements. Overall, Lodge++ advances long-sequence dance generation by delivering coherent choreography and detailed movement with improved computational efficiency.

Abstract

We propose Lodge++, a choreography framework to generate high-quality, ultra-long, and vivid dances given the music and desired genre. To handle the challenges in computational efficiency, the learning of complex and vivid global choreography patterns, and the physical quality of local dance movements, Lodge++ adopts a two-stage strategy to produce dances from coarse to fine. In the first stage, a global choreography network is designed to generate coarse-grained dance primitives that capture complex global choreography patterns. In the second stage, guided by these dance primitives, a primitive-based dance diffusion model is proposed to further generate high-quality, long-sequence dances in parallel, faithfully adhering to the complex choreography patterns. Additionally, to improve the physical plausibility, Lodge++ employs a penetration guidance module to resolve character self-penetration, a foot refinement module to optimize foot-ground contact, and a multi-genre discriminator to maintain genre consistency throughout the dance. Lodge++ is validated by extensive experiments, which show that our method can rapidly generate ultra-long dances suitable for various dance genres, ensuring well-organized global choreography patterns and high-quality local motion.
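The coarse-to-fine idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the function names `fill_segment` and `generate_long_sequence` are hypothetical, the primitives are reduced to single 2-D key poses, and plain linear interpolation stands in for the diffusion model. The sketch only shows why fixing coarse primitives first lets the gaps between them be filled independently, and therefore in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def fill_segment(start_pose, end_pose, seg_len):
    """Hypothetical stand-in for the second stage.

    Produces seg_len frames that start at start_pose and move toward
    end_pose (exclusive). Linear interpolation replaces diffusion
    denoising purely for illustration.
    """
    frames = []
    for t in range(seg_len):
        w = t / seg_len
        frames.append([(1 - w) * a + w * b for a, b in zip(start_pose, end_pose)])
    return frames


def generate_long_sequence(primitives, seg_len):
    """Fill each gap between consecutive primitives independently, then join."""
    pairs = list(zip(primitives[:-1], primitives[1:]))
    # Each segment depends only on its two boundary primitives, so the
    # calls do not need to wait on one another.
    with ThreadPoolExecutor() as pool:
        segments = list(pool.map(lambda p: fill_segment(p[0], p[1], seg_len), pairs))
    return [frame for seg in segments for frame in seg]


# Toy "primitives": three 2-D key poses (an assumption made for brevity;
# real primitives would be full-body motion clips).
primitives = [[0.0, 0.0], [1.0, 2.0], [0.0, 4.0]]
dance = generate_long_sequence(primitives, seg_len=4)
```

Because each segment begins exactly at its primitive and stops one step short of the next one, the joined sequence passes through every primitive without a seam, even though no segment ever sees its neighbors.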
