EverybodyDance: Bipartite Graph-Based Identity Correspondence for Multi-Character Animation
Haotian Ling, Zequn Chen, Qiuying Chen, Donglin Di, Yongjia Ma, Hao Li, Chen Wei, Zhulin Tao, Xun Yang
TL;DR
EverybodyDance tackles identity confusion in multi-character animation by explicitly modeling Identity Correspondence (IC) with the Identity Matching Graph (IMG), whose edge affinities are computed by Mask-Query Attention (MQA). It improves IC robustness through Identity Embedded Guidance (IEG), Multi-Scale Matching (MSM), and Pre-Classified Sampling (PCS), and introduces the Identity Correspondence Evaluation (ICE) benchmark for evaluating IC in complex scenes. Empirical results show higher IC accuracy and video fidelity than state-of-the-art baselines, including in cross-scenario generalization and motion transfer settings. The work highlights a graph-based paradigm for disentangling multiple characters and guiding accurate identity mapping in diffusion-based video generation.
Abstract
Consistent pose-driven character animation has achieved remarkable progress in single-character scenarios. However, extending these advances to multi-character settings is non-trivial, especially when characters swap positions. Beyond mere scaling, the core challenge lies in enforcing correct Identity Correspondence (IC) between characters in the reference and generated frames. To address this, we introduce EverybodyDance, a systematic solution targeting IC correctness in multi-character animation. EverybodyDance is built around the Identity Matching Graph (IMG), which models characters in the generated and reference frames as two node sets in a weighted complete bipartite graph. Edge weights, computed via our proposed Mask-Query Attention (MQA), quantify the affinity between each pair of characters. Our key insight is to formalize IC correctness as a graph structural metric and to optimize it during training. We also propose a series of targeted strategies tailored for multi-character animation, including identity-embedded guidance, a multi-scale matching strategy, and pre-classified sampling, which work synergistically. Finally, to evaluate IC performance, we curate the Identity Correspondence Evaluation (ICE) benchmark, dedicated to multi-character IC correctness. Extensive experiments demonstrate that EverybodyDance substantially outperforms state-of-the-art baselines in both IC and visual fidelity.
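The bipartite-graph formulation described in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the abstract does not give the exact form of MQA or of the graph structural metric, so the mask-pooled attention in `edge_affinities` and the correct-edge mass ratio in `matching_score` are assumptions chosen only to make the idea concrete. All function names, shapes, and the toy data are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def edge_affinities(queries, keys, gen_masks, ref_masks):
    """Assumed stand-in for mask-pooled attention (hypothetical, not the paper's MQA).

    queries:   (Nq, d) tokens of the generated frame
    keys:      (Nk, d) tokens of the reference frame
    gen_masks: (G, Nq) binary masks, one row per generated character
    ref_masks: (R, Nk) binary masks, one row per reference character
    Returns a (G, R) matrix of edge weights for the complete bipartite graph.
    """
    d = queries.shape[-1]
    attn = softmax(queries @ keys.T / np.sqrt(d), axis=-1)   # (Nq, Nk)
    mass = gen_masks @ attn @ ref_masks.T                    # (G, R)
    counts = gen_masks.sum(axis=1, keepdims=True)            # tokens per generated character
    return mass / np.maximum(counts, 1.0)

def matching_score(weights, gt):
    """Assumed metric: fraction of total edge weight on ground-truth edges."""
    correct = weights[np.arange(len(gt)), gt].sum()
    return correct / weights.sum()

# Toy scene: two characters with three tokens each, distinguishable features.
e0, e1 = np.array([4.0, 0.0]), np.array([0.0, 4.0])
keys = np.stack([e0] * 3 + [e1] * 3)      # reference frame: character 0, then character 1
queries = np.stack([e1] * 3 + [e0] * 3)   # generated frame: the characters have swapped positions

masks = np.array([[1, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 1]], dtype=float)
gt = np.array([1, 0])  # generated node 0 should match reference node 1, and vice versa

weights = edge_affinities(queries, keys, masks, masks)
score = matching_score(weights, gt)
loss = -np.log(score)  # an assumed training objective that rewards mass on correct edges
```

In this toy case almost all attention mass falls on the swapped (correct) edges, so the score is close to 1 and the loss is close to 0. If the attention instead followed spatial position, the score would fall toward 0, which is the kind of identity confusion the IMG is meant to penalize.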
