CRA-PCN: Point Cloud Completion with Intra- and Inter-level Cross-Resolution Transformers

Yi Rong, Haoran Zhou, Lixin Yuan, Cheng Mei, Jiahao Wang, Tong Lu

TL;DR

CRA-PCN tackles point cloud completion by introducing the Cross-Resolution Transformer (CRT), which performs explicit cross-resolution aggregation with local attention across multiple scales. CRT comes in inter-level and intra-level forms that share a unified implementation; both are embedded in a decoder of three up-sampling blocks, yielding a coarse-to-fine encoder–decoder. Experiments on PCN, ShapeNet-55/34, and MVP show state-of-the-art performance and indicate that combining intra- and inter-level CRA with the recursive multi-scale design is critical. The approach offers an efficient, plug-in mechanism for learning rich multi-scale local geometry, with strong generalization and practical value for 3D reconstruction.

Abstract

Point cloud completion is an indispensable task for recovering complete point clouds from inputs left incomplete by occlusion, limited sensor resolution, and similar factors. The family of coarse-to-fine generation architectures has recently exhibited great success in point cloud completion and has gradually become mainstream. In this work, we unveil one of the key ingredients behind these methods: meticulously devised feature extraction operations with explicit cross-resolution aggregation. We present the Cross-Resolution Transformer, which efficiently performs cross-resolution aggregation with local attention mechanisms. With the help of our recursive designs, the proposed operation can capture more scales of features than common aggregation operations, which is beneficial for capturing fine geometric characteristics. While prior methods have explored various forms of inter-level cross-resolution aggregation, the effectiveness of the intra-level form, and of combining the two, has not been analyzed. With unified designs, the Cross-Resolution Transformer can perform intra- or inter-level cross-resolution aggregation by switching inputs. We integrate the two forms of Cross-Resolution Transformer into one up-sampling block for point generation and, following the coarse-to-fine manner, construct CRA-PCN to incrementally predict complete shapes with stacked up-sampling blocks. Extensive experiments demonstrate that our method outperforms state-of-the-art methods by a large margin on several widely used benchmarks. Code is available at https://github.com/EasyRy/CRA-PCN.
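
The idea of aggregating features across resolutions with local attention, and of obtaining the intra-level form "by switching inputs", can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names (`knn_indices`, `local_cross_attention`), the plain dot-product scoring, and the absence of learned projections and positional encodings are all assumptions made for illustration, and the sketch covers a single scale only, omitting the recursive multi-scale design.

```python
import math


def knn_indices(query_xyz, support_xyz, k):
    """For each query point, return the indices of its k nearest support points."""
    result = []
    for q in query_xyz:
        dists = sorted((math.dist(q, s), j) for j, s in enumerate(support_xyz))
        result.append([j for _, j in dists[:k]])
    return result


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_cross_attention(query_xyz, query_feat, support_xyz, support_feat, k=3):
    """Each query point attends only to its k nearest support points.

    Assumption: scaled dot-product scores on raw features stand in for the
    learned query/key/value projections a real transformer layer would use.
    """
    dim = len(query_feat[0])
    neighbours = knn_indices(query_xyz, support_xyz, k)
    out = []
    for q_feat, nbrs in zip(query_feat, neighbours):
        scores = [
            sum(a * b for a, b in zip(q_feat, support_feat[j])) / math.sqrt(dim)
            for j in nbrs
        ]
        weights = softmax(scores)
        # Weighted sum of the neighbours' features, channel by channel.
        out.append([
            sum(w * support_feat[j][c] for w, j in zip(weights, nbrs))
            for c in range(dim)
        ])
    return out
```

In this sketch, passing two different point clouds as query and support corresponds to the inter-level case, while passing the same point cloud for both roles corresponds to the degenerate, single-scale intra-level case.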

Paper Structure (31 sections, 4 equations, 4 figures, 7 tables)

Figures (4)

  • Figure 1: Illustration of our main idea. We analyze several point generation methods from the perspective of cross-resolution aggregation (CRA). (a) Common pipeline of coarse-to-fine completion approaches. (b) The plain generation operation generates points without any explicit CRA. (c) & (d) Several methods use skip connections to aggregate features from other generated point clouds or from the partial input into the current point cloud, which efficiently captures multi-scale features. (e) Our method extracts richer multi-scale features with a newly designed, enhanced inter-level CRA, and also combines intra- and inter-level CRA to better capture geometric characteristics.
  • Figure 2: Illustration of the Cross-Resolution Transformer (CRT). CRT performs cross-resolution aggregation over $m$ scales; $m=3$ in this figure. (a) Inter-level CRT lets the query point cloud (blue) aggregate features from the support point cloud (green); both are intermediate point clouds produced during the generation phase. (b) Intra-level CRT performs cross-resolution aggregation inside the current point cloud and is a degenerate form of the inter-level one.
  • Figure 3: (a) The overall architecture of CRA-PCN, which consists of an encoder, a seed generator, and stacked up-sampling blocks. (b) Details of the up-sampling block, which is composed of an MLP, an inter-level Cross-Resolution Transformer, an intra-level Cross-Resolution Transformer, and a deconvolution.
  • Figure 4: Visual comparison on PCN dataset.
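
The coarse-to-fine pipeline described in the Figure 3(a) caption, a seed point cloud refined by stacked up-sampling blocks, can be sketched in terms of point-count bookkeeping. This is only an illustration under stated assumptions: the per-block ratios, the offset scale, and the names `upsample_block` and `coarse_to_fine` are hypothetical, and random offsets stand in for the displacements that a learned block (MLP, transformers, deconvolution) would predict.

```python
import random


def upsample_block(points, ratio, scale, rng):
    """Split every parent point into `ratio` children placed near the parent.

    Placeholder (assumption): uniform random offsets replace the
    displacements a trained up-sampling block would predict.
    """
    children = []
    for x, y, z in points:
        for _ in range(ratio):
            children.append((
                x + rng.uniform(-scale, scale),
                y + rng.uniform(-scale, scale),
                z + rng.uniform(-scale, scale),
            ))
    return children


def coarse_to_fine(seeds, ratios=(2, 2, 2), scale=0.1, seed=0):
    """Apply one up-sampling block per ratio and return every level.

    The ratios and the halving of `scale` per level are illustrative
    choices, not values taken from the paper.
    """
    rng = random.Random(seed)
    levels = [list(seeds)]
    for ratio in ratios:
        levels.append(upsample_block(levels[-1], ratio, scale, rng))
        scale /= 2  # finer levels make smaller corrections
    return levels
```

With four seed points and three blocks of ratio 2, the returned levels contain 4, 8, 16, and 32 points, mirroring how each stacked block densifies the prediction of the block before it.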