Harmonious Group Choreography with Trajectory-Controllable Diffusion

Yuqin Dai, Wanlu Zhu, Ronghui Li, Zeping Ren, Xiangzheng Zhou, Jixuan Ying, Jun Li, Jian Yang

TL;DR

This work tackles music-driven group choreography, targeting dancer collisions and foot sliding with a two-stage diffusion framework, TCDiff. A Dance-Trajectory Navigator first generates non-overlapping trajectories; a Trajectory-Specialist Diffusion module then produces coordinated movements conditioned on those trajectories, aided by a Footwork Adaptor and a Relative Forward-Kinematic (RFK) loss that strengthens root-to-joint coherence. Key innovations include the distance-consistency loss for spacing, the Fusion Projection to reduce dancer ambiguity at minimal cost, and the RFK loss to tighten kinematic relationships. The method is evaluated on the AIOZ-GDance dataset, where it is reported to outperform baselines on both multi-dancer and single-dancer metrics. The approach yields more realistic, synchronized group performances and offers a scalable path toward high-quality group choreography in entertainment and VR settings.

Abstract

Creating group choreography from music is crucial in cultural entertainment and virtual reality, with a focus on generating harmonious movements. Despite growing interest, recent approaches often struggle with two major challenges: multi-dancer collisions and single-dancer foot sliding. To address these challenges, we propose a Trajectory-Controllable Diffusion (TCDiff) framework, which leverages non-overlapping trajectories to ensure coherent and aesthetically pleasing dance movements. To mitigate collisions, we introduce a Dance-Trajectory Navigator that generates collision-free trajectories for multiple dancers, utilizing a distance-consistency loss to maintain optimal spacing. Furthermore, to reduce foot sliding, we present a footwork adaptor that adjusts trajectory displacement between frames, supported by a relative forward-kinematic loss to further reinforce the correlation between movements and trajectories. Experiments demonstrate our method's superiority.
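
To make the idea of penalizing dancers who come too close more concrete, the following is a minimal sketch of a pairwise spacing penalty on ground-plane trajectories. It is not the paper's distance-consistency loss, whose exact form is not given here; the function name `spacing_penalty`, the squared-hinge form, and the `min_dist` threshold are all assumptions made for illustration.

```python
import math
from itertools import combinations


def spacing_penalty(trajectories, min_dist=0.5):
    """Mean squared-hinge penalty on pairwise dancer distances.

    trajectories: list of dancers, each a list of (x, z) ground-plane
    positions, one per frame. All dancers must have the same frame count.
    min_dist: hypothetical minimum allowed spacing (assumed, not from the paper).
    """
    num_frames = len(trajectories[0])
    pairs = list(combinations(range(len(trajectories)), 2))
    if not pairs or num_frames == 0:
        return 0.0

    total = 0.0
    for t in range(num_frames):
        for i, j in pairs:
            xi, zi = trajectories[i][t]
            xj, zj = trajectories[j][t]
            dist = math.hypot(xi - xj, zi - zj)
            # Only pairs closer than min_dist contribute to the penalty.
            total += max(0.0, min_dist - dist) ** 2
    return total / (num_frames * len(pairs))


# Two dancers kept 2 units apart incur no penalty; two dancers
# standing on the same spot incur min_dist ** 2.
far_apart = [[(0.0, 0.0), (0.0, 0.0)], [(2.0, 0.0), (2.0, 0.0)]]
overlapping = [[(0.0, 0.0)], [(0.0, 0.0)]]
print(spacing_penalty(far_apart))
print(spacing_penalty(overlapping))
```

A penalty of this kind is zero whenever every pair of dancers stays at least `min_dist` apart, so it only pushes on trajectories that are about to collide.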

Paper Structure (18 sections, 9 equations, 8 figures, 3 tables)

Figures (8)

  • Figure 1: Visualizations of two key issues in baseline models: multi-dancer collisions [gcd] (in the black box) and single-dancer foot sliding [codancers] (in the red box). In contrast, our approach eliminates these issues, delivering superior visual aesthetics.
  • Figure 2: Our TCDiff framework consists of two components: Dance-Trajectory Navigator (DTN) and Trajectory-Specialist Diffusion (TSDiff). Initially, DTN is designed to extract disjoint trajectories (dancer positions) for mitigating dancer ambiguity, as dancers' coordinates exhibit distinct differences and are less prone to confusion. Subsequently, TSDiff utilizes the trajectories for conditional diffusion to generate corresponding dance movements. During this process, a Fusion Projection enhances group information before inputting it into the multi-dance transformer, while a footwork adaptor adjusts the final footwork.
  • Figure 3: Fusion Projection Module.
  • Figure 4: Generated results with different dancer counts.
  • Figure 5: Visual comparison with baselines. The baseline methods often produce collisions (in the black box) or lack foot movement during exchanges. In contrast, our model minimizes dancer overlaps and generates more natural footwork.
  • ...and 3 more figures