NARF24: Estimating Articulated Object Structure for Implicit Rendering
Stanley Lewis, Tom Gao, Odest Chadwicke Jenkins
TL;DR
NARF24 addresses articulated-object understanding for robots by learning a shared NeRF across a few configurations of an object and using image-space part segmentations to infer joint parameters. The method builds per-part point clouds, registers them across scenes with ICP and TEASER++, and estimates joint connectivity and joint type via Chamfer-distance comparisons. The result is a URDF-like model that supports configuration-conditioned rendering in an articulation-aware NeRF. Real-world experiments (including a sparse-label scenario) and a simulated 6-DOF arm demonstrate that accurate articulation estimation and configurable rendering are achievable with limited segmentation data. The approach combines NeRF with parts-based segmentation, point-cloud registration, and classical joint-estimation techniques, which the authors present as a path toward scalable articulation modeling.
Abstract
Articulated objects and their representations pose a difficult problem for robots. These objects require representations not only of geometry and texture, but also of the connections and joint parameters that make up each articulation. We propose a method that learns a common Neural Radiance Field (NeRF) representation across a small number of collected scenes. This representation is combined with a parts-based image segmentation to produce an implicit-space part localization, from which the connectivity and joint parameters of the articulated object can be estimated, thus enabling configuration-conditioned rendering.
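To make the Chamfer-distance comparison mentioned in the TL;DR concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the helper names (`chamfer_distance`, `rotate_about_z`, `classify_joint`), the synthetic point cloud, the assumption that the joint axis and motion magnitude are already known, and the squared-distance form of the Chamfer distance are all illustrative assumptions. The idea is to move one part's point cloud under each candidate joint model and keep the model whose result lies closest to the part as observed in a second configuration.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance (mean squared nearest-neighbor form)
    between point sets a of shape (N, 3) and b of shape (M, 3)."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)  # (N, M) squared distances
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

def rotate_about_z(points, angle):
    """Rotate points about the z-axis through the origin (hypothetical joint axis)."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ rotation.T

def classify_joint(part_a, part_b, candidates):
    """Score each candidate motion model by Chamfer distance and return the best."""
    scores = {name: chamfer_distance(move(part_a), part_b)
              for name, move in candidates.items()}
    return min(scores, key=scores.get), scores

# Synthetic part: a point cloud offset from the joint axis, seen in two configurations.
rng = np.random.default_rng(0)
part_a = rng.uniform(-0.5, 0.5, size=(200, 3)) + np.array([1.0, 0.0, 0.0])
part_b = rotate_about_z(part_a, np.pi / 6)  # ground truth here is a revolute motion

# Candidate models (assumed for illustration): a rotation about the z-axis,
# and a pure translation that aligns the two centroids.
centroid_shift = part_b.mean(axis=0) - part_a.mean(axis=0)
candidates = {
    "revolute": lambda p: rotate_about_z(p, np.pi / 6),
    "prismatic": lambda p: p + centroid_shift,
}

best, scores = classify_joint(part_a, part_b, candidates)
print(best, scores)
```

In this toy setup the revolute candidate reproduces the second cloud and therefore scores lowest. A real pipeline would have to estimate the axis and motion parameters from the registered per-part point clouds, rather than assume them as this sketch does.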
