MAMM: Motion Control via Metric-Aligning Motion Matching

Naoki Agata, Takeo Igarashi

TL;DR

This work tackles cross-domain motion control: aligning a source motion to an arbitrary control sequence without paired data. It introduces Metric-Aligning Motion Matching (MAMM), a unified framework built on fused semi-unbalanced Gromov-Wasserstein (FSUGW) optimization that matches motion patches to cross-domain control cues while preserving within-domain structure. The method supports diverse controls (sketches, waveforms, labels, audio, and even other motions) and uses a coarse-to-fine strategy, patch extraction and blending, and soft constraints (soft keyframes) to keep results temporally coherent. The approach yields fast, data-efficient alignments and enables applications such as waveform-to-motion, sketch-to-motion, motion-by-numbers, and audio-driven control, with demonstrated cross-skeletal alignment and practical user interfaces. Overall, MAMM offers a flexible way to control motion with minimal domain-specific engineering.

Abstract

We introduce a novel method for controlling a motion sequence with an arbitrary temporal control sequence via temporal alignment. Temporal alignment of motion has gained significant attention owing to its applications in motion control and retargeting. Traditional methods rely on either learned or hand-crafted cross-domain mappings between frames in the original and control domains, which often require large, paired, or annotated datasets and time-consuming training. Our approach, named Metric-Aligning Motion Matching, achieves alignment by considering only within-domain distances. It computes distances among patches in each domain and seeks a matching that optimally aligns the two sets of within-domain distances. This framework allows a motion sequence to be aligned to various types of control sequences, including sketches, labels, audio, and another motion sequence, all without manually defined mappings or training with annotated data. We demonstrate the effectiveness of our approach through applications in efficient motion control, showcasing its potential in practical scenarios.
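
To make the idea of aligning two sets of within-domain distances concrete, the following is a minimal NumPy sketch, not the authors' implementation. It slices each sequence into overlapping patches, builds one pairwise distance matrix per domain, and scores a candidate matching `T` with the standard Gromov-Wasserstein discrepancy $\sum_{i,j,k,l} (D^X_{ik} - D^Y_{jl})^2 T_{ij} T_{kl}$. The function names, the Euclidean patch distance, the patch length, and the toy data are all assumptions made for illustration; the fused and semi-unbalanced terms of FSUGW are not modelled here.

```python
import numpy as np

def extract_patches(seq, patch_len, stride=1):
    """Slice a (frames, features) array into flattened overlapping windows."""
    n = (len(seq) - patch_len) // stride + 1
    return np.stack([seq[i * stride : i * stride + patch_len].ravel()
                     for i in range(n)])

def pairwise_dist(patches):
    """Euclidean distance between every pair of patches (assumed metric)."""
    diff = patches[:, None, :] - patches[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def gw_discrepancy(D_x, D_y, T):
    """Standard squared-loss GW cost of coupling T between two distance matrices.

    Expands (a - b)^2 = a^2 + b^2 - 2ab so no 4-D tensor is materialised.
    """
    p = T.sum(axis=1)                      # mass on each X patch
    q = T.sum(axis=0)                      # mass on each Y patch
    term_x = p @ (D_x ** 2) @ p
    term_y = q @ (D_y ** 2) @ q
    cross = np.sum((D_x @ T @ D_y.T) * T)
    return term_x + term_y - 2.0 * cross

# Toy data: Y is a rotated and shifted copy of X, so the two domains have
# different coordinates but identical within-domain patch distances.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
Y = X @ Q + 5.0

Px, Py = extract_patches(X, 4), extract_patches(Y, 4)
Dx, Dy = pairwise_dist(Px), pairwise_dist(Py)

n = len(Px)
T_id = np.eye(n) / n                       # patch i matched to patch i
T_shuf = np.roll(T_id, 3, axis=1)          # patch i matched to patch i + 3

cost_id = gw_discrepancy(Dx, Dy, T_id)
cost_shuf = gw_discrepancy(Dx, Dy, T_shuf)
print(cost_id, cost_shuf)
```

Because the score compares only distances measured inside each domain, it never needs a cross-domain distance between a motion frame and a control sample. In this toy setup the correct matching scores near zero while the shifted one does not.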

Paper Structure

This paper contains 36 sections, 4 equations, 10 figures, 1 algorithm.

Figures (10)

  • Figure 1: Intuitive concept of the MAMM framework. Our framework optimizes the transport plan $T$ and the aligned motion sequence $X'$, whose patch pairwise distance matrix is similar to that of the control sequence $Y$, while resembling the original motion $X$.
  • Figure 2: Explanation of the fused semi-unbalanced Gromov-Wasserstein (FSUGW) objective and the algorithm that minimizes it. $L_W$ constrains $X'$ to resemble $X$ via the transport plan $T$, and $L_{GW}$ encourages $T$ to be metric-aligning, which leads to structural similarity between $X'$ and $Y$. We optimize the FSUGW objective with alternating steps: the first step optimizes $T$ with $X'$ fixed, and the second step optimizes $X'$ with $T$ fixed.
  • Figure 3: Demonstration of character motion control using a sketched curve. The character's movement follows the abstract structure of the curve. Additional examples are available in the supplementary video.
  • Figure 4: Example of controlling the aligned motion with soft keyframes. (a) Users can specify keyframe poses, such as "hands-up horse rider" and "hands-down horse rider," using our interface. For simplicity, users select poses from the original sequence, although our method does not impose strict constraints on keyframe selection. (b) When the motion curve approaches a keyframe, the algorithm ensures that the corresponding poses in the aligned motion closely match the keyframe poses. (c) Conversely, when the curve is distant from a keyframe, the algorithm selects alternative poses, such as "side-step" poses, which differ from the keyframe poses. This illustrates how soft keyframes can influence motion in both positive and negative contexts.
  • Figure 5: Examples of controlling motion with waveforms. We used (1) a frequency-varying sine wave and (2) a center-alternating sine wave to control the dance motion's frequency and phase. For more examples, please refer to the attached video.
  • ...and 5 more figures
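
The second alternating step described in the Figure 2 caption (updating $X'$ with $T$ fixed) can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the paper's algorithm: it considers only a Wasserstein-style term with squared Euclidean cost, $\sum_{i,j} T_{ij} \lVert x_i - x'_j \rVert^2$, whose minimizer over each $x'_j$ is the $T$-weighted average of the source patches. The helper names, the averaging-based blending of overlapping patches, and the toy data are hypothetical.

```python
import numpy as np

def extract_patches(seq, patch_len, stride=1):
    """Slice a (frames, features) array into flattened overlapping windows."""
    n = (len(seq) - patch_len) // stride + 1
    return np.stack([seq[i * stride : i * stride + patch_len].ravel()
                     for i in range(n)])

def barycentric_update(X_patches, T, eps=1e-12):
    """Minimise sum_ij T[i, j] * ||X_patches[i] - X_new[j]||^2 over X_new.

    Each output patch j is the average of source patches weighted by column j
    of T. Dividing by the column mass keeps the result an average even when
    the columns of T do not carry equal mass.
    """
    mass = T.sum(axis=0)
    return (T.T @ X_patches) / np.maximum(mass, eps)[:, None]

def blend_patches(patches, patch_len, stride=1):
    """Average overlapping patches back into a (frames, features) sequence."""
    n, dim = patches.shape
    feat = dim // patch_len
    frames = (n - 1) * stride + patch_len
    out = np.zeros((frames, feat))
    count = np.zeros((frames, 1))
    for j in range(n):
        s = j * stride
        out[s : s + patch_len] += patches[j].reshape(patch_len, feat)
        count[s : s + patch_len] += 1
    return out / count

# Toy source motion: 10 frames, 2 features per frame.
seq = np.arange(20.0).reshape(10, 2)
P = extract_patches(seq, patch_len=4)          # 7 patches of dimension 8

# A hard coupling that sends source patch perm[j] to output slot j.
perm = np.array([6, 5, 4, 3, 2, 1, 0])
T = np.zeros((7, 7))
T[perm, np.arange(7)] = 1.0 / 7

new_patches = barycentric_update(P, T)
aligned = blend_patches(new_patches, patch_len=4)
print(aligned.shape)
```

With a hard (permutation-like) coupling the update simply copies the matched source patch into each slot; with a softer coupling each output patch becomes a weighted mix of several source patches, and the blending step then reconciles overlapping frames.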