ESCAPE: Equivariant Shape Completion via Anchor Point Encoding

Burak Bekci, Nassir Navab, Federico Tombari, Mahdi Saleh

TL;DR

ESCAPE addresses rotation-robust 3D shape completion without pose estimation by introducing an anchor-point distance encoding. Each of the $n$ points is described by its distances to $k$ anchor points, forming a distance matrix $D_p \in \mathbb{R}^{n\times k}$ that an encoder-decoder transformer processes; a Levenberg-Marquardt optimization then recovers the coordinates. The framework guarantees a unique reconstruction when $k \ge d+1$ and achieves constant error bounds for the distance-based representation, yielding rotation-equivariant outputs validated on PCN, OmniObject3D, and KITTI cars. Empirical results demonstrate robust, high-fidelity completions under arbitrary rotations and partiality, outperforming canonical-alignment baselines and enabling practical deployment in dynamic environments without additional pose-estimation modules.
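
As a rough illustration of the encoding described above, the NumPy sketch below builds the distance matrix $D_p$ and checks that it does not change when the cloud is rotated and translated. Several things here are assumptions rather than details from the paper: anchors are chosen by farthest point sampling seeded at the point farthest from the centroid, the function names are hypothetical, and the coordinates are recovered with a closed-form linear trilateration (a simpler stand-in for the Levenberg-Marquardt step) to show why $k \ge d+1$ affinely independent anchors pin down each point.

```python
import numpy as np

def farthest_point_anchors(points, k):
    """Pick k anchor indices by farthest point sampling (assumed selection rule).

    The first anchor is the point farthest from the centroid, so the choice
    depends only on the geometry of the cloud, not on the coordinate frame.
    """
    centroid = points.mean(axis=0)
    first = int(np.argmax(np.linalg.norm(points - centroid, axis=1)))
    chosen = [first]
    min_dist = np.linalg.norm(points - points[first], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(chosen)

def distance_encoding(points, anchors):
    """Return D_p with shape (n, k): distance from every point to every anchor."""
    return np.linalg.norm(points[:, None, :] - anchors[None, :, :], axis=2)

def trilaterate(D, anchors):
    """Closed-form recovery of coordinates from distances to the anchors.

    Subtracting the squared-distance equation of anchor 0 from the others gives
    the linear system 2 (a_j - a_0)^T x = |a_j|^2 - |a_0|^2 - d_j^2 + d_0^2.
    """
    A = 2.0 * (anchors[1:] - anchors[0])                          # (k-1, d)
    sq = np.sum(anchors**2, axis=1)                               # (k,)
    B = (sq[1:] - sq[0])[None, :] - D[:, 1:]**2 + D[:, :1]**2     # (n, k-1)
    X, *_ = np.linalg.lstsq(A, B.T, rcond=None)                   # (d, n)
    return X.T

rng = np.random.default_rng(0)
points = rng.normal(size=(200, 3))
idx = farthest_point_anchors(points, k=4)        # k = d + 1 anchors in 3D
D = distance_encoding(points, points[idx])

# Random rotation (orthogonal matrix with determinant +1) plus a translation.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1
moved = points @ Q.T + np.array([0.5, -1.0, 2.0])
idx_moved = farthest_point_anchors(moved, k=4)
D_moved = distance_encoding(moved, moved[idx_moved])

# Same anchors are selected, so the encoding is unchanged by the rigid motion.
print(np.array_equal(idx, idx_moved), np.allclose(D, D_moved))
```

Because the encoding is identical in both frames, decoding it against the moved anchors returns the moved shape, which is the equivariance property the summary refers to.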

Abstract

Shape completion, a crucial task in 3D computer vision, involves predicting and filling the missing regions of scanned or partially observed objects. Current methods expect a known pose or canonical coordinates and do not perform well under varying rotations, limiting their real-world applicability. We introduce ESCAPE (Equivariant Shape Completion via Anchor Point Encoding), a novel framework designed to achieve rotation-equivariant shape completion. Our approach employs a distinctive encoding strategy: it selects anchor points from a shape and represents every point by its distances to all anchor points. This enables the model to capture a consistent, rotation-equivariant understanding of the object's geometry. ESCAPE leverages a transformer architecture to encode and decode the distance transformations, ensuring that generated shape completions remain accurate and equivariant under rotational transformations. Subsequently, we perform optimization to calculate the predicted shapes from the encodings. Experimental evaluations demonstrate that ESCAPE achieves robust, high-quality reconstructions across arbitrary rotations and translations, showcasing its effectiveness in real-world applications without additional pose estimation modules.
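
The final optimization step can be sketched as follows. This is a minimal, hand-written Levenberg-Marquardt loop in NumPy, not the authors' implementation: the function name `lm_recover`, the damping schedule, the start at the anchor centroid, and the cube-corner anchors are all assumptions made for illustration, and exact distances stand in for the transformer's predictions.

```python
import numpy as np

def lm_recover(d, anchors, iters=100, lam=1e-3):
    """Recover one point from its distances d (k,) to anchors (k, dim).

    Minimises sum_j (|x - a_j| - d_j)^2 with a basic Levenberg-Marquardt loop:
    solve (J^T J + lam I) step = -J^T r, keep the step only if the cost drops,
    and adapt the damping lam after every trial.
    """
    dim = anchors.shape[1]
    x = anchors.mean(axis=0)                     # assumed frame-independent start
    r = np.linalg.norm(x - anchors, axis=1) - d
    cost = r @ r
    for _ in range(iters):
        if cost < 1e-24:
            break
        diff = x - anchors
        norms = np.maximum(np.linalg.norm(diff, axis=1), 1e-12)
        J = diff / norms[:, None]                # d r_j / d x = (x - a_j) / |x - a_j|
        step = np.linalg.solve(J.T @ J + lam * np.eye(dim), -(J.T @ r))
        x_new = x + step
        r_new = np.linalg.norm(x_new - anchors, axis=1) - d
        cost_new = r_new @ r_new
        if cost_new < cost:                      # accept: lean toward Gauss-Newton
            x, r, cost = x_new, r_new, cost_new
            lam = max(lam / 10.0, 1e-12)
        else:                                    # reject: lean toward gradient descent
            lam *= 10.0
    return x

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(100, 3))   # stand-in for a complete shape

# Illustrative anchors at the corners of a cube enclosing the shape (k = 8).
corners = np.array([[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
                   dtype=float)
anchors = 1.5 * corners

# Exact distances play the role of the distances a network would predict.
D = np.linalg.norm(points[:, None, :] - anchors[None, :, :], axis=2)
recovered = np.array([lm_recover(d, anchors) for d in D])

# Decode the same distances against rotated and translated anchors.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1
t = np.array([0.3, -0.7, 1.1])
anchors_moved = anchors @ Q.T + t
recovered_moved = np.array([lm_recover(d, anchors_moved) for d in D])

errors = np.linalg.norm(recovered - points, axis=1)
print("median recovery error:", np.median(errors))
```

The distances never change under a rigid motion, so the only frame-dependent input to the solver is the anchor set; decoding against moved anchors therefore yields the correspondingly moved shape without any pose estimation.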

Paper Structure

This paper contains 35 sections, 1 theorem, 10 equations, 9 figures, and 12 tables.

Key Result

Theorem A.1

For an input perturbation $\boldsymbol{\epsilon}$, our distance matrix representation maintains constant error bounds that are independent of network depth.
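
The theorem's bound is not restated here, so the following is only an elementary observation about the encoding, not the paper's result. For a point $p$ and an anchor $a$ (both symbols, and the anchor perturbation $\boldsymbol{\epsilon}_a$, are introduced for this illustration), the reverse triangle inequality limits how much a single entry of the distance matrix can move under a perturbation, before any network layer is involved:

```latex
\begin{align}
\bigl|\,\lVert (p + \boldsymbol{\epsilon}) - a \rVert - \lVert p - a \rVert\,\bigr|
  &\le \lVert \boldsymbol{\epsilon} \rVert, \\
\bigl|\,\lVert (p + \boldsymbol{\epsilon}) - (a + \boldsymbol{\epsilon}_a) \rVert - \lVert p - a \rVert\,\bigr|
  &\le \lVert \boldsymbol{\epsilon} \rVert + \lVert \boldsymbol{\epsilon}_a \rVert .
\end{align}
```

The first line covers a perturbed point with a fixed anchor; the second covers the case where the anchor is itself a perturbed input point.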

Figures (9)

  • Figure 1: Existing shape completion methods (top) rely on canonical coordinates and perform poorly under rotation changes or when the canonical reference is unknown. With our anchor point encoding (bottom), we consistently complete the shape under arbitrary rotation in non-canonical coordinates.
  • Figure 2: The overall pipeline of the ESCAPE model. First, we extract $k$ anchor points to construct rotation-invariant features as input to a transformer-based encoder-decoder architecture. The transformer is modified to predict the distances between points in the complete geometry and the extracted anchor points. It simultaneously constructs the complete geometry with $m$ points and predicts their distances to the anchor points. Finally, an optimization procedure finds the coordinates of the complete shape.
  • Figure 3: Qualitative comparison of models trained on the PCN dataset and tested with rotated inputs. The first column of each row shows the input to the model. Every even row contains the rotated input of the preceding row.
  • Figure 4: Qualitative comparison of models trained on the PCN dataset and tested on the OmniObject dataset. The first column of each row shows the input to the model.
  • Figure 5: Qualitative comparison of SCARP and ESCAPE on the PCN dataset.
  • ...and 4 more figures

Theorems & Definitions (1)

  • Theorem A.1: Error Bounds