Utilizing Motion Matching with Deep Reinforcement Learning for Target Location Tasks
Jeongmin Lee, Taesoo Kwon, Hyunju Shin, Yoonsang Lee
TL;DR
The paper addresses long-horizon target-location control for virtual characters by coupling motion matching with deep reinforcement learning (DRL), so that the policy directly generates the motion-matching query. Each RL step is one motion-matching-and-playback cycle, with state $s_t = \{c_t, g_t\}$ and action $a_t = \{t_t\}$. The motion-matching features $f_i = \{c_i, t_i\} \in \mathbb{R}^{27}$ drive the selection of the next frame, and the target location $g_t$ guides progress, which enables efficient learning without full-body motion synthesis. To improve learning in environments with moving obstacles, the paper introduces a hit reward term, giving $r_t = \exp(-\mathrm{dist}(s_t)) + \exp(-\mathrm{hits}(a_t))$, together with a 10-stage obstacle curriculum and optional obstacle-sensing inputs; these promote safer trajectories within about $1$ second of lookahead. Experiments show that policies learn to reach target locations with limited training time (e.g., as little as $0.2$ seconds per step), with convergence taking anywhere from thousands of steps to roughly $14\mathrm{M}$ steps in complex scenes. The results highlight the approach's practicality for rapid animation development and interactive applications. The authors acknowledge memory and exploration-speed constraints, which future work may address with autoencoder-based feature learning and related techniques.
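As a rough illustration of the reward $r_t = \exp(-\mathrm{dist}(s_t)) + \exp(-\mathrm{hits}(a_t))$, the sketch below evaluates both terms in plain Python. The summary does not spell out how $\mathrm{hits}$ is computed, so the helper `count_hits`, the `radius` parameter, and the idea of counting predicted trajectory points that fall near an obstacle are assumptions made for illustration only, not the authors' implementation.

```python
import math


def count_hits(future_points, obstacles, radius=0.5):
    """Count predicted trajectory points lying within `radius` of any obstacle.

    Assumption: a "hit" is a future trajectory point that comes closer than
    `radius` to an obstacle centre. The paper may define hits differently.
    """
    return sum(
        1
        for point in future_points
        if any(math.dist(point, obstacle) < radius for obstacle in obstacles)
    )


def reward(char_pos, goal_pos, future_points, obstacles, radius=0.5):
    """r_t = exp(-dist) + exp(-hits), following the formula in the summary."""
    dist_term = math.exp(-math.dist(char_pos, goal_pos))
    hit_term = math.exp(-count_hits(future_points, obstacles, radius))
    return dist_term + hit_term
```

Under these assumptions each term lies in $(0, 1]$, so the reward is highest (equal to 2) when the character stands on the target and its predicted trajectory touches no obstacle.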
Abstract
We present an approach using deep reinforcement learning (DRL) to directly generate motion matching queries for long-term tasks, particularly targeting the reaching of specific locations. By integrating motion matching and DRL, our method demonstrates the rapid learning of policies for target location tasks within minutes on a standard desktop, employing a simple reward design. Additionally, we propose a unique hit reward and obstacle curriculum scheme to enhance policy learning in environments with moving obstacles.
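To make "directly generate motion matching queries" concrete, here is a minimal sketch of how a policy output could be turned into a database lookup. It assumes that $c$ holds current-pose features, that $t$ holds trajectory features produced by the policy, and that the query is their concatenation matched by squared Euclidean distance. The function names and the brute-force search are hypothetical; the summary does not state how the 27 feature dimensions split between the two parts or how the search is implemented.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def motion_matching_query(database, current_features, policy_trajectory):
    """Return the index of the database frame closest to the query.

    database          : list of feature vectors f_i = {c_i, t_i}
    current_features  : c_t, taken from the character's current state (assumed)
    policy_trajectory : t_t, the action emitted by the policy (assumed)
    """
    query = list(current_features) + list(policy_trajectory)
    # Brute-force nearest-neighbour search; real systems often use
    # acceleration structures, which are omitted here for clarity.
    return min(
        range(len(database)),
        key=lambda i: squared_distance(database[i], query),
    )
```

In this reading, one RL step would call `motion_matching_query` once, play back the matched frames, and then observe the next state.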
