High-Degrees-of-Freedom Dynamic Neural Fields for Robot Self-Modeling and Motion Planning
Lennart Schulze, Hod Lipson
TL;DR
This paper tackles robot self-modeling without depth information by learning a kinematic representation from 2D images annotated with camera poses and joint configurations. It proposes a dynamic neural density field for high numbers of degrees of freedom (DOFs), built on an encoder-based architecture and trained with curricular sampling. The field is conditioned on the robot's joint states and uses a single camera with base rotation for quasi-multi-view consistency. The method yields a neural-implicit full-body self-model that supports differentiable forward predictions and enables inverse kinematics and configuration-space planning. On a $7$-DOF robot, it achieves a Chamfer-L2 distance of $1.94\%$ of the workspace dimension in a $1.254$ m tall workspace. This approach enables autonomous self-modeling and motion planning without depth data and opens avenues for dynamic object-centric scenes and multi-robot settings.
Abstract
A robot self-model is a task-agnostic representation of the robot's physical morphology that can be used for motion planning tasks in the absence of a classical geometric kinematic model. In particular, when such a kinematic model is hard to engineer or the robot's kinematics change unexpectedly, human-free self-modeling is a necessary feature of truly autonomous agents. In this work, we leverage neural fields to allow a robot to self-model its kinematics as a neural-implicit query model learned only from 2D images annotated with camera poses and configurations. This enables significantly greater applicability than existing approaches, which depend on depth images or geometry knowledge. To this end, alongside a curricular data sampling strategy, we propose a new encoder-based neural density field architecture for dynamic object-centric scenes conditioned on high numbers of degrees of freedom (DOFs). In a 7-DOF robot test setup, the learned self-model achieves a Chamfer-L2 distance of 2% of the robot's workspace dimension. We demonstrate the capabilities of this model on motion planning tasks as an exemplary downstream application.
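
To make the reported metric concrete, the following is a minimal sketch of a Chamfer distance between two point clouds, expressed relative to a workspace dimension. It is illustrative only and is not taken from the paper. The function names `chamfer_l2` and `relative_chamfer`, the toy point clouds, and the specific convention (symmetric average of Euclidean nearest-neighbour distances) are assumptions; the paper's exact Chamfer-L2 definition and normalization may differ.

```python
import numpy as np


def chamfer_l2(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point clouds a (N, 3) and b (M, 3).

    Assumed convention: average of the mean Euclidean nearest-neighbour
    distance from a to b and from b to a.
    """
    # Pairwise Euclidean distances, shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Nearest neighbour in each direction, averaged over points and directions.
    return float(0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean()))


def relative_chamfer(a: np.ndarray, b: np.ndarray, workspace_dim: float) -> float:
    """Chamfer distance as a fraction of a workspace dimension (same units)."""
    return chamfer_l2(a, b) / workspace_dim


# Hypothetical toy data: a predicted and a ground-truth cloud, in metres.
pred = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
true = np.array([[0.0, 0.0, 0.1], [1.0, 0.0, 0.0]])

# 1.254 m is the workspace height quoted in the TL;DR.
ratio = relative_chamfer(pred, true, workspace_dim=1.254)
print(f"Chamfer: {chamfer_l2(pred, true):.3f} m, relative: {100 * ratio:.2f}%")
```

If the percentage is taken relative to the 1.254 m workspace height, the reported 1.94% would correspond to roughly 0.0194 × 1.254 m ≈ 0.024 m, or about 2.4 cm.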
