MAMM: Motion Control via Metric-Aligning Motion Matching
Naoki Agata, Takeo Igarashi
TL;DR
This work tackles cross-domain motion control: aligning a source motion to an arbitrary control sequence without paired data. It introduces Metric-Aligning Motion Matching (MAMM), a unified framework built on fused semi-unbalanced Gromov-Wasserstein (FSUGW) optimization, which matches motion patches to cross-domain control cues while preserving within-domain structure. The method supports diverse controls (sketches, waveforms, labels, audio, and even other motions) and uses a coarse-to-fine strategy, patch extraction and blending, and soft constraints (soft-keyframes) to keep results temporally coherent. The alignments are fast and data-efficient, and they enable applications such as waveform-to-motion, sketch-to-motion, motion-by-numbers, and audio-driven control, with demonstrated cross-skeletal alignment and practical user interfaces. Overall, MAMM offers a flexible, data-efficient approach to motion control that needs minimal domain-specific engineering.
Abstract
We introduce a novel method for controlling a motion sequence with an arbitrary temporal control sequence through temporal alignment. Temporal alignment of motion has gained significant attention owing to its applications in motion control and retargeting. Traditional methods rely on either learned or hand-crafted cross-domain mappings between frames in the original and control domains, which often require large, paired, or annotated datasets and time-consuming training. Our approach, named Metric-Aligning Motion Matching, achieves alignment by considering only within-domain distances. It computes distances among patches in each domain and seeks a matching that optimally aligns the two sets of within-domain distances. This framework allows a motion sequence to be aligned to various types of control sequences, including sketches, labels, audio, and another motion sequence, all without manually defined mappings or training on annotated data. We demonstrate the effectiveness of our approach through applications in efficient motion control, showcasing its potential in practical scenarios.
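
The idea of matching by within-domain distances alone can be illustrated with a toy example. The sketch below is not the paper's solver: MAMM optimizes a fused semi-unbalanced Gromov-Wasserstein objective, whereas this example substitutes an exhaustive search over one-to-one assignments, which is feasible only for a handful of patches. All function names and the toy data are hypothetical, and each row is assumed to stand in for an already-flattened patch.

```python
import itertools
import numpy as np


def pairwise_distances(patches):
    """Euclidean distances between all pairs of rows (within one domain)."""
    patches = np.asarray(patches, dtype=float)
    diff = patches[:, None, :] - patches[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def distortion(d_src, d_ctl, assignment):
    """Squared mismatch between control distances and the distances of the
    source patches assigned to them (a hard-assignment Gromov-Wasserstein cost)."""
    idx = np.asarray(assignment)
    return float(np.sum((d_ctl - d_src[np.ix_(idx, idx)]) ** 2))


def brute_force_match(d_src, d_ctl):
    """Try every one-to-one assignment of source patches to control patches.
    ASSUMPTION: stand-in for the FSUGW optimization; exponential cost."""
    n_src, n_ctl = len(d_src), len(d_ctl)
    best = min(
        itertools.permutations(range(n_src), n_ctl),
        key=lambda a: distortion(d_src, d_ctl, a),
    )
    return list(best), distortion(d_src, d_ctl, best)


# Toy data: five 2-D "motion" patches and five 1-D "control" patches.
# The two domains have different dimensions, so no cross-domain distance exists.
t = np.array([0.0, 1.0, 3.0, 7.0, 12.0])
source = t[:, None] * np.array([0.6, 0.8])   # points along a line in 2-D
order = [3, 0, 4, 1, 2]                      # hidden correspondence
control = (10.0 + t[order])[:, None]         # shifted and shuffled, 1-D

assignment, cost = brute_force_match(
    pairwise_distances(source), pairwise_distances(control)
)
print(assignment, cost)
```

Here `assignment[j]` is the index of the source patch matched to control patch `j`. Because the toy control sequence is a shuffled, shifted copy of the source's spacing, the search recovers `order` with near-zero cost, even though the two domains are never compared directly.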
