UPTor: Unified 3D Human Pose Dynamics and Trajectory Prediction for Human-Robot Interaction
Nisarga Nilavadi, Andrey Rudenko, Timm Linder
TL;DR
UPTor tackles the problem of jointly predicting 3D human pose dynamics and global trajectories from a short pose sequence to support human-aware robot navigation. It introduces a motion transformation that places sequences in a global, orientation-aligned frame, a Graph Attention Network to encode skeletal structure, and a non-autoregressive Transformer to fuse spatial and temporal dynamics into unified predictions, all trained end-to-end. Evaluations on Human3.6M, CMU-Mocap, and the newly released DARKO dataset demonstrate competitive pose accuracy and improved trajectory prediction with a smaller, faster model suitable for real-time robotic use. The DARKO dataset and accompanying code aim to advance research in human-robot interaction and navigation in realistic, egocentric perception settings, with potential impact on intralogistics and service robots.
Abstract
We introduce a unified approach to forecast the dynamics of human keypoints along with the motion trajectory based on a short sequence of input poses. While many studies address either full-body pose prediction or motion trajectory prediction, only a few attempt to merge them. We propose a motion transformation technique to simultaneously predict full-body pose and trajectory keypoints in a global coordinate frame. We utilize an off-the-shelf 3D human pose estimation module, a graph attention network to encode the skeleton structure, and a compact, non-autoregressive transformer suitable for real-time motion prediction for human-robot interaction and human-aware navigation. We introduce a human navigation dataset, "DARKO", with a specific focus on navigational activities that are relevant for human-aware mobile robot navigation. We perform an extensive evaluation on Human3.6M, CMU-Mocap, and our DARKO dataset. In comparison to prior work, we show that our approach is compact, real-time, and accurate in predicting human navigation motion across all datasets. Result animations, our dataset, and code will be available at https://nisarganc.github.io/UPTor-page/
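To make the idea of an orientation-aligned global frame concrete, the following is a minimal sketch of one way such a transformation could be written. It is not the authors' motion transformation, whose details are not given in this summary. The function names `align_sequence` and `restore_sequence`, the joint indices, the z-up convention, and the choice of the last observed frame as reference are all assumptions made for illustration.

```python
import numpy as np


def align_sequence(poses, root=0, left_hip=1, right_hip=2):
    """Express a pose sequence in a frame aligned with the person's heading.

    Hypothetical sketch: `poses` has shape (T, J, 3), z points up, and the
    joint indices are assumed. The reference is the last frame of the input.
    """
    poses = np.asarray(poses, dtype=float)
    ref = poses[-1]

    # Shift the ground-plane origin under the root joint; heights are kept.
    origin = ref[root].copy()
    origin[2] = 0.0

    # The hip axis points from right hip to left hip. Rotating it by -90
    # degrees about z gives the facing direction in the ground plane.
    hip = ref[left_hip] - ref[right_hip]
    forward = np.array([hip[1], -hip[0]])
    yaw = np.arctan2(forward[1], forward[0])

    # Rotate by -yaw about z so the facing direction maps onto +x.
    c, s = np.cos(-yaw), np.sin(-yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    aligned = (poses - origin) @ rot.T
    return aligned, yaw, origin


def restore_sequence(aligned, yaw, origin):
    """Invert `align_sequence`, mapping poses back to the original frame."""
    c, s = np.cos(-yaw), np.sin(-yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return np.asarray(aligned, dtype=float) @ rot + origin
```

Because the root joint keeps its ground-plane displacement across frames, a sequence aligned this way carries both the body pose and the walked path in one set of coordinates, and a prediction made in the aligned frame can be mapped back with `restore_sequence`.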
