MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation
Kaixing Yang, Xulong Tang, Ziqiao Peng, Yuxuan Hu, Jun He, Hongyan Liu
TL;DR
MEGADance tackles genre-aware music-to-dance generation with a two-stage framework that separates choreographic generality from genre-specific styling. The first stage combines Finite Scalar Quantization with kinematic-dynamic constraints to obtain high-fidelity latent dance representations. The second stage maps music to that latent space using a Mixture-of-Experts with Universal and Specialized components, built on a Mamba-Transformer backbone. The approach achieves state-of-the-art performance on FineDance and AIST++ across objective metrics and human studies, while enabling robust genre controllability and efficient inference. The resulting genre-consistent, music-driven 3D dance generation has potential applications in interactive choreography and virtual performance.
Abstract
Music-driven 3D dance generation has attracted increasing attention in recent years, with promising applications in choreography, virtual reality, and creative content creation. Previous work has generated realistic dance movements from audio signals. However, existing methods underutilize genre conditioning, often treating it as an auxiliary modifier rather than a core semantic driver. This oversight compromises music-motion synchronization and disrupts dance genre continuity, particularly during complex rhythmic transitions, leading to visually unsatisfactory results. To address this challenge, we propose MEGADance, a novel architecture for music-driven 3D dance generation. By decoupling choreographic consistency into dance generality and genre specificity, MEGADance achieves high dance quality and strong genre controllability. It consists of two stages: (1) the High-Fidelity Dance Quantization Stage (HFDQ), which encodes dance motions into a latent representation via Finite Scalar Quantization (FSQ) and reconstructs them with kinematic-dynamic constraints, and (2) the Genre-Aware Dance Generation Stage (GADG), which maps music into the latent representation by combining a Mixture-of-Experts (MoE) mechanism with a Mamba-Transformer hybrid backbone. Extensive experiments on the FineDance and AIST++ datasets demonstrate the state-of-the-art performance of MEGADance both qualitatively and quantitatively. Code will be released upon acceptance.
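
As a rough illustration of the quantization step named in stage (1), the following is a minimal sketch of generic Finite Scalar Quantization, not the authors' implementation. It assumes odd level counts per channel, and the function names (`fsq_quantize`, `codes_to_index`) and the level configuration `[5, 5, 3]` are hypothetical choices made only for this example; the abstract does not state the actual configuration. Each latent channel is squashed with `tanh`, scaled to its number of levels, and rounded to the nearest integer, so the codebook is implicit rather than learned.

```python
import numpy as np

def fsq_quantize(z, levels):
    """Quantize latent vectors channel by channel (generic FSQ sketch).

    z      : array of shape (..., d), unbounded latent values
    levels : d odd integers, number of quantization levels per channel
             (odd levels are an assumption made here to keep the grid symmetric)
    Returns integer codes in [-(L-1)/2, (L-1)/2] for each channel.
    """
    levels = np.asarray(levels)
    assert np.all(levels % 2 == 1), "this sketch only handles odd level counts"
    half = (levels - 1) / 2
    bounded = np.tanh(z) * half          # each channel now lies in (-half, half)
    # During training, a straight-through estimator would pass gradients
    # through the rounding step; that part is not shown here.
    return np.round(bounded).astype(np.int64)

def codes_to_index(codes, levels):
    """Map per-channel codes to one token index via mixed-radix encoding."""
    levels = np.asarray(levels)
    digits = codes + (levels - 1) // 2   # shift each channel to [0, L-1]
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    return (digits * basis).sum(axis=-1)

# Hypothetical configuration: 3 channels with 5, 5, and 3 levels (75 codes total).
LEVELS = [5, 5, 3]
example_codes = fsq_quantize(np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]]), LEVELS)
example_tokens = codes_to_index(example_codes, LEVELS)
```

Under these assumptions, the resulting token indices are what a stage (2) model would predict from music, and a decoder would map the quantized codes back to motion.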
