Table of Contents
Fetching ...

AutoURDF: Unsupervised Robot Modeling from Point Cloud Frames Using Cluster Registration

Jiong Lin, Lechen Zhang, Kwansoo Lee, Jialong Ning, Judah Goldfeder, Hod Lipson

TL;DR

AutoURDF tackles unsupervised robotic modeling from visual data by deriving URDF descriptions directly from time-series point clouds. It introduces a cluster-based 6-DoF registration framework that tracks multiple point clusters, segments moving parts, infers a tree-structured topology via MST, and estimates joint parameters to output URDFs compatible with simulators, validated on synthetic and real data. Across experiments, it outperforms prior methods in registration accuracy and topology inference while delivering fast end-to-end performance. This approach enables scalable, annotation-free self-modeling of diverse robotic morphologies for simulation, control, and planning.

Abstract

Robot description models are essential for simulation and control, yet their creation often requires significant manual effort. To streamline this modeling process, we introduce AutoURDF, an unsupervised approach for constructing description files for unseen robots from point cloud frames. Our method leverages a cluster-based point cloud registration model that tracks the 6-DoF transformations of point clusters. Through analyzing cluster movements, we hierarchically address the following challenges: (1) moving part segmentation, (2) body topology inference, and (3) joint parameter estimation. The complete pipeline produces robot description files that are fully compatible with existing simulators. We validate our method across a variety of robots, using both synthetic and real-world scan data. Results indicate that our approach outperforms previous methods in registration and body topology estimation accuracy, offering a scalable solution for automated robot modeling.

AutoURDF: Unsupervised Robot Modeling from Point Cloud Frames Using Cluster Registration

TL;DR

AutoURDF tackles unsupervised robotic modeling from visual data by deriving URDF descriptions directly from time-series point clouds. It introduces a cluster-based 6-DoF registration framework that tracks multiple point clusters, segments moving parts, infers a tree-structured topology via MST, and estimates joint parameters to output URDFs compatible with simulators, validated on synthetic and real data. Across experiments, it outperforms prior methods in registration accuracy and topology inference while delivering fast end-to-end performance. This approach enables scalable, annotation-free self-modeling of diverse robotic morphologies for simulation, control, and planning.

Abstract

Robot description models are essential for simulation and control, yet their creation often requires significant manual effort. To streamline this modeling process, we introduce AutoURDF, an unsupervised approach for constructing description files for unseen robots from point cloud frames. Our method leverages a cluster-based point cloud registration model that tracks the 6-DoF transformations of point clusters. Through analyzing cluster movements, we hierarchically address the following challenges: (1) moving part segmentation, (2) body topology inference, and (3) joint parameter estimation. The complete pipeline produces robot description files that are fully compatible with existing simulators. We validate our method across a variety of robots, using both synthetic and real-world scan data. Results indicate that our approach outperforms previous methods in registration and body topology estimation accuracy, offering a scalable solution for automated robot modeling.

Paper Structure

This paper contains 13 sections, 3 equations, 11 figures, 3 tables, 1 algorithm.

Figures (11)

  • Figure 1: We present AutoURDF, a novel framework that derives robot description files from time-series point cloud frames. We validate our method across a diverse range of robots, including both synthetic and real-world scan data.
  • Figure 2: AutoURDF overview. Our method provides a complete pipeline for the automated construction of robot description files. (A). Data Collection: By commanding robots with randomly sampled motor angle sequences, we capture the corresponding time-series point cloud frames. (B). Three Substeps: We tackle the problem in three substeps: 1) part segmentation, 2) body topology inference, and 3) joint parameter estimation. (C). Description File Generation: The final output is a URDF file that defines the robot's links, joints, and collision properties. We successfully build and simulate the description model for a WX200 robot arm from real-world scan data.
  • Figure 3: Point Cluster Registration and Part Segmentation. We tackle the part segmentation problem by extracting 6-DoF transformations from the registration of a set of point clusters. For each pair of consecutive frames, we begin by initializing point clusters using the K-means clustering algorithm. Each point cluster is assigned a 6-DoF coordinate, ${X^t_i}$, predicted from the previous registration step. Next, we employ a shared neural network as the Step Model to predict the target frame's point cluster coordinates, ${X^{t+1}_i}$. Following the rotation representation study geist2024learningbregier2021deepzhou2019continuity, we represent it as a 7-dimensional vector, combining Cartesian coordinates and quaternion orientation. The predicted coordinates are passed through an Anchor Model that computes cluster transformations, starting from the initial coordinates ${X^1}$. This process is repeated across frames in the sequence, tracing the motion of a set of 6-DoF coordinate frames. Finally, based on the time-series 6-DoF transformations, we compute a correlation matrix between the point clusters to segment distinct moving parts.
  • Figure 4: Topology Inference. We infer the robot's body topology by merging the result of the MST algorithm on position terms $x$ and the segmented point clusters. In the segmentation graph, the nodes represent the center positions of the clusters; the edges indicate the motion correlated between the connected nodes.
  • Figure 5: Point Cloud to Mesh. For each segmented link, we integrate the sparse point cloud data from the world coordinate system at each time step to form a dense point cloud in the local frame. The dense point cloud is constructed by combining 10 frames of data. We then apply the marching cubeslorensen1987marching algorithm to convert this dense point cloud into a mesh file. The resulting mesh files are high-quality and watertight, as demonstrated by the example of the WX200 robot arm's end-effector shown in the figure.
  • ...and 6 more figures