Non-rigid Structure-from-Motion: Temporally-smooth Procrustean Alignment and Spatially-variant Deformation Modeling

Jiawei Shi, Hui Deng, Yuchao Dai

TL;DR

This work tackles the persistent rotation and deformation ambiguities in non-rigid structure-from-motion by introducing a spatial-temporal framework that couples Temporally-smooth Procrustean Alignment (TPA) with a Spatial-Weighted Nuclear Norm (SWNN). TPA refines camera motion by aligning consecutive 3D shapes, leveraging temporal coherence and removing reliance on a single mean reference shape. SWNN uses kernel-based proxy shapes and nearly rigid region segmentation to adapt the low-rank prior to spatially varying deformations, enabling accurate recovery of drastic local changes. Extensive experiments on MoCap, NRSfM Challenge, Semi-dense, and H3WB datasets demonstrate superior reconstruction accuracy and robustness to occlusion, with ablations confirming the effectiveness of each component and the value of the joint spatial-temporal approach.
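The idea of aligning consecutive 3D shapes, rather than aligning every frame to one mean reference shape, can be illustrated with a small NumPy sketch. This is not the authors' TPA module: it is a plain orthogonal Procrustes (Kabsch) rotation applied frame to frame, and the function names `procrustes_rotation` and `align_consecutive` are hypothetical. It assumes the shapes are given as an array of size F x 3 x P.

```python
import numpy as np

def procrustes_rotation(source, target):
    """Rotation R (3x3, det = +1) minimizing ||R @ source - target||_F.

    source, target: 3 x P arrays of centered 3D points (Kabsch solution).
    """
    H = target @ source.T                      # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(U @ Vt))         # guard against reflections
    return U @ np.diag([1.0, 1.0, d]) @ Vt

def align_consecutive(shapes):
    """Align each frame to the previously aligned frame (hypothetical helper).

    shapes: F x 3 x P array. Returns (aligned shapes, per-frame rotations).
    """
    shapes = np.asarray(shapes, dtype=float)
    centered = shapes - shapes.mean(axis=2, keepdims=True)  # remove translation
    aligned = np.empty_like(centered)
    rotations = np.empty((len(centered), 3, 3))
    aligned[0] = centered[0]                   # first frame fixes the global frame
    rotations[0] = np.eye(3)
    for t in range(1, len(centered)):
        R = procrustes_rotation(centered[t], aligned[t - 1])
        rotations[t] = R
        aligned[t] = R @ centered[t]
    return aligned, rotations
```

Because each frame is compared only with its predecessor, no mean shape has to be estimated first; the trade-off in this simplified version is that alignment errors can accumulate along the sequence.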

Abstract

Even though Non-rigid Structure-from-Motion (NRSfM) has been extensively studied and great progress has been made, there are still key challenges that hinder its broad real-world application: 1) the inherent motion/rotation ambiguity requires either explicit camera motion recovery with extra constraints or complex Procrustean Alignment; 2) existing low-rank modeling of the global shape can over-penalize drastic deformations in the 3D shape sequence. This paper proposes to resolve the above issues from a spatial-temporal modeling perspective. First, we propose a novel Temporally-smooth Procrustean Alignment module that estimates 3D deforming shapes and adjusts the camera motion by aligning the 3D shape sequence consecutively. Our new alignment module removes the need for a complex reference 3D shape during alignment, which is more conducive to non-isotropic deformation modeling. Second, we propose a spatial-weighted approach that enforces the low-rank constraint adaptively at different locations to better accommodate the reconstruction of drastic spatially-variant deformations. Our modeling outperforms existing low-rank-based methods, and extensive experiments across different datasets validate the effectiveness of our method.
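To make the over-penalization point concrete, the sketch below compares a plain nuclear norm of the stacked shape matrix with a per-point weighted variant. This is only an illustration under stated assumptions: the paper's actual spatial-weighted formulation is not given in this summary, the function `pointwise_weighted_nuclear_norm` and its `weights` argument are hypothetical, and the shapes are assumed to be an F x 3 x P array flattened to an F x 3P matrix.

```python
import numpy as np

def nuclear_norm(M):
    """Sum of singular values of a 2D matrix."""
    return np.linalg.svd(M, compute_uv=False).sum()

def pointwise_weighted_nuclear_norm(shapes, weights):
    """Nuclear norm of the shape matrix after scaling each point (hypothetical).

    shapes:  F x 3 x P array of 3D shapes.
    weights: length-P array; a small weight relaxes the low-rank
             penalty on that point's trajectory.
    """
    shapes = np.asarray(shapes, dtype=float)
    weights = np.asarray(weights, dtype=float)
    F, _, P = shapes.shape
    weighted = shapes * weights[None, None, :]   # scale every point's x, y, z
    return nuclear_norm(weighted.reshape(F, 3 * P))
```

With all weights equal to one this reduces to the usual global low-rank penalty; lowering the weight of a strongly deforming point reduces how much that point contributes to the penalty.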

Paper Structure (29 sections, 45 equations, 15 figures, 7 tables, 2 algorithms)

Figures (15)

  • Figure 1: Overview of our proposed TPA Module. (a) Camera motion can be recovered by orthographic and rank-3 constraints within the matrix factorization framework (Dai et al., 2014). (c) The Procrustean alignment framework uses GPA to resolve the rotation ambiguity. (b) We propose the TPA module, which aligns the 3D shapes of consecutive frames and corrects the camera motion by the temporal smoothing property.
  • Figure 2: Overview of region segmentation and proxy shape construction. (a) We use DFT to analyze the 3D trajectories, dividing them by comparing the frequency components contained in each trajectory. The figure shows a segmentation result. (b) The geometric center of the non-rigid region is set as the super point, and it is linearly combined with nearly rigid points to construct the proxy shape.
  • Figure 3: (a) Visualization of Shark and Dance results. (b) Visualization of Balloon and Articulated results. The top row shows the dataset images, and the bottom row shows the reconstructed 3D shapes. (c) Visualization of Rug and Eating2 (Fixed-type) results.
  • Figure 4: Shape alignment test for the TPA module. We compare the low-rank and smoothing properties of GT non-rigid sequences, sequences disrupted by random rotations, and sequences aligned by TPA or GPA. The results show that the properties of the TPA-aligned sequences are closer to those of the GT sequences.
  • Figure 5: (a) Shape errors on noisy sequences. (b) Experiment for 3D reconstruction on missing data.
  • ...and 10 more figures
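
The trajectory analysis described in the Figure 2 caption can be sketched roughly as follows. This is a simplified illustration, not the authors' procedure: it assumes the trajectories are already expressed in an aligned frame as an F x 3 x P array, and the `cutoff` bin, the `threshold`, and all function names are hypothetical choices. The super point is computed as the centroid of the points flagged as non-rigid; how it is combined with the nearly rigid points to form the proxy shape is not specified here and is left out.

```python
import numpy as np

def high_frequency_ratio(trajectories, cutoff):
    """Fraction of each point's non-DC spectral energy in bins above `cutoff`.

    trajectories: F x 3 x P array of 3D point trajectories over F frames.
    Returns a length-P array with values in [0, 1].
    """
    spectrum = np.fft.rfft(np.asarray(trajectories, dtype=float), axis=0)
    energy = (np.abs(spectrum) ** 2)[1:]        # drop the DC bin (bin 0)
    total = energy.sum(axis=(0, 1))
    high = energy[cutoff:].sum(axis=(0, 1))     # bins strictly above `cutoff`
    return high / np.maximum(total, 1e-12)      # avoid division by zero

def segment_nonrigid(trajectories, cutoff, threshold):
    """Boolean mask: True where a point is flagged as non-rigid."""
    return high_frequency_ratio(trajectories, cutoff) > threshold

def super_point(trajectories, nonrigid_mask):
    """Per-frame geometric center (F x 3) of the points flagged as non-rigid."""
    nonrigid_mask = np.asarray(nonrigid_mask, dtype=bool)
    if not nonrigid_mask.any():
        raise ValueError("no non-rigid points selected")
    return np.asarray(trajectories, dtype=float)[:, :, nonrigid_mask].mean(axis=2)
```

A point whose trajectory is dominated by slow motion gets a ratio near zero and is treated as nearly rigid, while a point with substantial high-frequency content is assigned to the non-rigid region.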