Deep Part Induction from Articulated Object Pairs

Li Yi, Haibin Huang, Difan Liu, Evangelos Kalogerakis, Hao Su, Leonidas Guibas

TL;DR

This work tackles mobility-based part induction from articulated object pairs by proposing a class-agnostic, data-driven pipeline that jointly learns correspondences, 3D deformation flows, and segmentation of moving parts. The approach combines three neural modules, namely Correspondence Proposal, Flow (PairNet), and Segmentation (RPEN), and iteratively refines their predictions in an ICP-like loop to reveal piecewise rigid structure despite geometric differences and noisy data. It is trained on a large synthetic dataset with ground-truth part correspondences and motions, and is shown to outperform state-of-the-art baselines on both synthetic and real datasets while generalizing to unseen object categories. The framework enables applications in shape animation and shape–image joint analysis.

Abstract

Object functionality is often expressed through part articulation, as when the two rigid parts of a pair of scissors pivot against each other to perform the cutting function. Such articulations are often similar across objects within the same functional category. In this paper, we explore how the observation of different articulation states provides evidence for part structure and motion of 3D objects. Our method takes as input a pair of unsegmented shapes representing two different articulation states of two functionally related objects, and induces their common parts along with their underlying rigid motion. This is a challenging setting: we assume no prior shape structure, no prior shape category information, and no consistent shape orientation; the articulation states may belong to objects of different geometry; and the inputs may be noisy and partial scans, or point clouds lifted from RGB images. Our method learns a neural network architecture with three modules that respectively propose correspondences, estimate 3D deformation flows, and perform segmentation. To achieve optimal performance, our architecture alternates between correspondence, deformation flow, and segmentation prediction iteratively in an ICP-like fashion. Our results demonstrate that our method significantly outperforms state-of-the-art techniques in the task of discovering articulated parts of objects. In addition, our part induction is object-class agnostic and successfully generalizes to new and unseen objects.

Paper Structure

This paper contains 61 sections, 9 equations, 12 figures, 10 tables.

Figures (12)

  • Figure 1: Correspondence proposal module. We use a PointNet++ based sub-module to extract point-wise features for the input point clouds. The learned features are then fed into a matching sub-module for correspondence proposal. The matching sub-module also predicts a correspondence mask that determines whether each point should be matched.
  • Figure 2: Flow Module. The refined matching probabilities are concatenated with the pairwise disparity and fed into the flow module. The flow module learns a point-wise deformation flow from one point set to the other.
  • Figure 3: Segmentation module. The predicted deformation flow on the first point set, together with its point positions, is fed into this module, which acts as a neural net-based, differentiable sequential RANSAC. Similar to sequential RANSAC, the inputs are processed through three sub-modules (hypothesis generation, verification, and a recurrent part extraction network), resulting in a set of soft segmentation indicator functions and part confidence scores.
  • Figure 4: Iterative refinement of deformation flow and segmentation. The outputs usually converge after 5 iterations.
  • Figure 5: Deformation flow visualization. We estimate a dense flow from point set 1 to point set 2 and apply it to deform point set 1. Deformation results are shown in (a) to (f): (a) Ours, (b) 3DFlow, (c) 3D match, (d) LMVCNN, (e) ED, (f) NRR. Colors on the deformed point set denote the per-point flow error, whose range is shown on the color bar at the right side of each row.
  • ...and 7 more figures
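
The Figure 3 caption compares the segmentation module to sequential RANSAC. For readers unfamiliar with that baseline, the following is a minimal sketch of the classical, non-learned procedure on a toy 2D problem: repeatedly hypothesize a rigid motion from a few sampled points, verify it by counting points whose flow it explains, extract the best-supported part, and continue on what is left. It is not the paper's network. The function names, the 2D setting, and the parameters `tol`, `trials`, and `min_part` are all assumptions made for illustration.

```python
import math
import random


def fit_rigid_2d(src, dst):
    """Least-squares 2D rigid motion (theta, tx, ty) mapping src onto dst."""
    n = len(src)
    cx, cy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    dx, dy = sum(q[0] for q in dst) / n, sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay, bx, by = px - cx, py - cy, qx - dx, qy - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)  # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    # Translation sends the rotated source centroid to the target centroid.
    return theta, dx - (c * cx - s * cy), dy - (s * cx + c * cy)


def apply_rigid_2d(motion, p):
    theta, tx, ty = motion
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty)


def sequential_ransac(points, flow, tol=0.05, trials=200, min_part=3, seed=0):
    """Toy baseline: segment points into rigid parts from a per-point flow.

    Returns (labels, motions); label -1 marks points not assigned to a part.
    """
    rng = random.Random(seed)
    targets = [(p[0] + f[0], p[1] + f[1]) for p, f in zip(points, flow)]
    labels = [-1] * len(points)
    remaining = list(range(len(points)))
    motions = []
    while len(remaining) >= max(min_part, 2):
        best = []
        for _ in range(trials):
            # Hypothesis generation: two points determine a 2D rigid motion.
            i, j = rng.sample(remaining, 2)
            hyp = fit_rigid_2d([points[i], points[j]], [targets[i], targets[j]])
            # Verification: keep points whose flow this motion explains.
            inliers = [k for k in remaining
                       if math.dist(apply_rigid_2d(hyp, points[k]), targets[k]) < tol]
            if len(inliers) > len(best):
                best = inliers
        if len(best) < min_part:
            break
        # Part extraction: refit on all inliers, label them, and recurse on the rest.
        motions.append(fit_rigid_2d([points[k] for k in best],
                                    [targets[k] for k in best]))
        for k in best:
            labels[k] = len(motions) - 1
        remaining = [k for k in remaining if labels[k] == -1]
    return labels, motions
```

A typical call passes a list of `(x, y)` points and a same-length list of `(dx, dy)` flow vectors, for example one static blade and one blade rotated about a pivot. The hard inlier threshold and discrete labels are the non-differentiable steps; the caption describes the module as a differentiable version that outputs soft segmentation indicator functions and part confidence scores.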