Large-Scale Multi-Character Interaction Synthesis

Ziyi Chang, He Wang, George Alex Koulieris, Hubert P. H. Shum

TL;DR

The paper tackles the problem of generating large-scale multi-character interactions with dense, coordinated transitions under limited multi-character data. It introduces a two-component autoregressive framework: a coordinatable interaction space built by partitioning characters into two-character groups and generating group interactions with a pre-trained diffusion model, and a transition planning network that outputs regrouping plans to steer future interactions, trained via reinforcement learning. The approach achieves smoother transitions and reduces character overlap while demonstrating scalability to more characters and transferability to other motion types (e.g., boxing). This data-efficient, modular pipeline offers a practical path to realistic social interactions in animation and interactive scenes without requiring extensive multi-character datasets.
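The control flow described above (pair up the characters, synthesize each pair, then regroup for the next step) can be sketched as follows. This is only an illustrative outline under strong assumptions, not the authors' implementation: `synthesize_pair` is a hypothetical stand-in for the pre-trained diffusion model, `plan_regrouping` swaps partners at random where the paper uses a policy trained with reinforcement learning, and the sketch does not condition a newly generated group on groups already generated. All function names are invented for this example.

```python
import random
from typing import List, Tuple

Pair = Tuple[int, int]


def partition_into_pairs(order: List[int]) -> List[Pair]:
    """Split an even-length ordering of character ids into consecutive pairs."""
    if len(order) % 2 != 0:
        raise ValueError("this sketch assumes an even number of characters")
    return [(order[i], order[i + 1]) for i in range(0, len(order), 2)]


def plan_regrouping(groups: List[Pair], rng: random.Random) -> List[Pair]:
    """Hypothetical planner: exchange partners between two random groups.

    The paper learns this decision as a policy; random swapping is used
    here only to show where a regrouping plan enters the loop.
    """
    groups = list(groups)
    if len(groups) < 2:
        return groups
    i, j = rng.sample(range(len(groups)), 2)
    (a, b), (c, d) = groups[i], groups[j]
    groups[i], groups[j] = (a, d), (c, b)
    return groups


def synthesize_pair(pair: Pair, step: int) -> str:
    """Hypothetical stand-in for the two-character motion generator."""
    return f"step {step}: characters {pair[0]} and {pair[1]} interact"


def rollout(num_characters: int, num_steps: int, seed: int = 0):
    """Autoregressive loop: synthesize each group, then regroup for the next step."""
    rng = random.Random(seed)
    groups = partition_into_pairs(list(range(num_characters)))
    history = []
    for step in range(num_steps):
        history.append([synthesize_pair(p, step) for p in groups])
        groups = plan_regrouping(groups, rng)
    return history, groups


if __name__ == "__main__":
    history, final_groups = rollout(num_characters=6, num_steps=3)
    for step_output in history:
        print(step_output)
    print("final grouping:", final_groups)
```

Keeping the planner and the pair synthesizer behind separate functions mirrors the modularity the summary emphasizes: either placeholder could be replaced without touching the loop.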

Abstract

Generating large-scale multi-character interactions is a challenging and important task in character animation. Multi-character interactions involve not only natural interactive motions but also characters coordinating with each other for transitions. For example, in a dance scenario, characters dance with partners and are also coordinated to new partners based on spatial and temporal observations. We term such transitions coordinated interactions and decompose them into interaction synthesis and transition planning. Previous methods for single-character animation do not consider the interactions that are critical for multiple characters. Deep-learning-based interaction synthesis usually focuses on two characters and does not consider transition planning. Optimization-based interaction synthesis relies on manually designed objective functions that may not generalize well. While crowd simulation involves more characters, their interactions are sparse and passive. We identify two challenges to multi-character interaction synthesis: the lack of data and the planning of transitions among close and dense interactions. Existing datasets either do not have multiple characters or do not have close and dense interactions. Planning transitions for close and dense multi-character interactions requires both spatial and temporal considerations. We propose a conditional generative pipeline comprising a coordinatable multi-character interaction space for interaction synthesis and a transition planning network for coordination. Our experiments demonstrate the effectiveness of the proposed pipeline for multi-character interaction synthesis, and the applications facilitated by our method show its scalability and transferability.

Paper Structure

This paper contains 17 sections, 17 equations, 9 figures, 2 tables, and 1 algorithm.

Figures (9)

  • Figure 1: Framework overview. Our pipeline is an autoregressive conditional generative model to plan transitions and synthesize interactions for multiple characters. It has two components: The first component divides multiple characters into groups and leverages a pre-trained diffusion-based model to autoregressively generate interactions for each group. The second component predicts a transition plan based on the observed interactions and serves as the conditional signal for the interaction synthesis.
  • Figure 2: Coordinatable multi-character interaction space by group division. We divide multiple characters into groups and re-group them for potential coordination. The group synthesis generates new motions group by group. The newly generated group is conditioned on the already generated ones, which is indicated by red arrows.
  • Figure 3: The planning network is learned as a policy network via deep reinforcement learning. The action is a transition plan that contains a high-level grouping choice.
  • Figure 4: (a) An example result from our method. (b) An example from InterGen where characters heavily overlap.
  • Figure 5: The density of hip distance for the three methods evaluated. The two modes in our hip distance density demonstrate minimal character overlap and clear transitions. InterGen$\dagger$ lacks transition planning, leading to an averaged distance density with a single mode. InterGen has a curve shape similar to that of InterGen$\dagger$, as neither has transition planning. Its much smaller mode value indicates that characters heavily overlap.
  • ...and 4 more figures
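
The hip-distance density in Figure 5 can be approximated with a simple histogram over pairwise hip distances. The sketch below is one plausible way to compute such a curve, not the paper's evaluation code: the choice of pairs (all pairs in every frame), the use of a histogram rather than a kernel density estimate, and the names `pairwise_hip_distances` and `histogram_density` are all assumptions made for illustration.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def pairwise_hip_distances(frames: Sequence[Sequence[Point]]) -> List[float]:
    """Collect the hip-to-hip distance of every character pair in every frame."""
    distances = []
    for hips in frames:
        for p, q in combinations(hips, 2):
            distances.append(math.dist(p, q))
    return distances


def histogram_density(values: Sequence[float], bin_width: float, num_bins: int) -> List[float]:
    """Histogram normalized so that the in-range bins integrate to 1.

    Values outside [0, bin_width * num_bins) are ignored.
    """
    counts = [0] * num_bins
    for v in values:
        idx = int(v // bin_width)
        if 0 <= idx < num_bins:
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * num_bins
    return [c / (total * bin_width) for c in counts]


if __name__ == "__main__":
    # Two toy frames with two characters each (hip positions in metres, assumed).
    frames = [
        [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
        [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)],
    ]
    dists = pairwise_hip_distances(frames)
    print(histogram_density(dists, bin_width=1.0, num_bins=6))
```

Under this reading, a mode near zero would correspond to overlapping characters, while two well-separated modes would be consistent with partners staying close and non-partners staying apart.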