Enhancing Robustness to Noise Corruption for Point Cloud Recognition via Spatial Sorting and Set-Mixing Aggregation Module

Dingxin Zhang, Jianhui Yu, Tengfei Xue, Chaoyi Zhang, Dongnan Liu, Weidong Cai

TL;DR

This work proposes Set-Mixer, a noise-robust aggregation module that facilitates communication among all points in order to extract geometric shape information and mitigate the influence of individual noise points. It improves model robustness to noise corruption through network architecture design.

Abstract

Current models for point cloud recognition demonstrate promising performance on synthetic datasets. However, real-world point cloud data inevitably contains noise, which degrades model robustness. While recent efforts enhance robustness through various strategies, comprehensive analyses from the standpoint of network architecture design are still lacking. Unlike traditional methods that rely on generic techniques, our approach optimizes model robustness to noise corruption through network architecture design. Inspired by the token-mixing technique applied to 2D images, we propose Set-Mixer, a noise-robust aggregation module that facilitates communication among all points in order to extract geometric shape information and mitigate the influence of individual noise points. A sorting strategy makes the module invariant to point permutation; it also addresses the unordered structure of point clouds and introduces consistent relative spatial information. Experiments conducted on ModelNet40-C indicate that Set-Mixer significantly enhances model performance on noisy point clouds, underscoring its potential to advance real-world applicability in 3D recognition and perception tasks.
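
The interplay between sorting and cross-point mixing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`sort_by_axis`, `mix_across_points`, `aggregate`) are hypothetical, the mixing step is reduced to a single linear layer with one weight per sorted position rather than a full MLP-Mixer layer, and sorting along the z-axis is just one possible choice of key. The point it shows is that position-dependent weights would normally break permutation invariance, and sorting the set first restores it.

```python
import random

def sort_by_axis(points, features, axis=2):
    """Reorder features by one coordinate of their points.

    Ties on the chosen axis are broken by the full coordinate tuple,
    so the order does not depend on the input order (assumes distinct points).
    """
    order = sorted(range(len(points)), key=lambda i: (points[i][axis], points[i]))
    return [features[i] for i in order]

def mix_across_points(sorted_features, weights):
    """Aggregate each channel with one weight per sorted position.

    A stand-in for token mixing: every point contributes to the output,
    so a single noisy point does not decide the result on its own.
    """
    channels = len(sorted_features[0])
    return [
        sum(w * f[c] for w, f in zip(weights, sorted_features))
        for c in range(channels)
    ]

def aggregate(points, features, weights, axis=2):
    """Sort the set, then mix across points: invariant to input order."""
    return mix_across_points(sort_by_axis(points, features, axis), weights)

# Small demonstration on random data (values are illustrative only).
rng = random.Random(0)
pts = [(rng.random(), rng.random(), rng.random()) for _ in range(8)]
feats = [[rng.random() for _ in range(4)] for _ in range(8)]
w = [rng.random() for _ in range(8)]

perm = list(range(8))
rng.shuffle(perm)
shuffled_pts = [pts[i] for i in perm]
shuffled_feats = [feats[i] for i in perm]

out_original = aggregate(pts, feats, w)
out_shuffled = aggregate(shuffled_pts, shuffled_feats, w)
```

Because both calls see the same sorted sequence, `out_original` and `out_shuffled` are identical, whereas applying `mix_across_points` directly to the unsorted features would generally give different results for the two orderings.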

Paper Structure (21 sections, 17 equations, 9 figures, 6 tables)

Figures (9)

  • Figure 1: Overview of the feature extraction processes of PointNet, PointNet++, PCT, and our Set-Mixer. The red lines and squares represent the noisy relations and features. For clarity, noisy features are depicted on the far right for illustration, and they do not correspond to any specific channel.
  • Figure 2: Illustration of the proposed Set-Mixer architecture. We incorporated the Set-Mixer module into the PointNet++ structure, replacing the max-pooling function. For local point sets generated with FPS and KNN, we sorted the points by coordinate values n times, indexing and concatenating the corresponding features. The resulting feature matrix was then transposed and passed through the MLP-Mixer layer, facilitating channel-wise cross-point learning and feature aggregation. Spatial centers of the point sets are calculated for KNN grouping and sorting at the next level.
  • Figure 3: Sorting the point set along the z-axis and indexing the features, as an example of Axis Projection Sorting (APS).
  • Figure 4: Illustration of Plane Clockwise Sorting (PCS). Z-axis and X-axis are selected as $n$ and $v_{ref}$ respectively.
  • Figure A5: Visualization of query points and spatial center points of layer 1 (512 sets, 32 neighbours) and layer 2 (128 sets, 64 neighbours) of Set-Mixer. Query points are depicted in blue, while the spatial centers are marked in orange.
  • ...and 4 more figures
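
The caption of Figure 4 names Plane Clockwise Sorting (PCS) with the Z-axis as $n$ and the X-axis as $v_{ref}$. The following is a sketch of one plausible reading of that caption, not the paper's definition. Several details are assumptions: the function name `plane_clockwise_order` is hypothetical, angles are measured around a `center` that defaults to the centroid of the set, and "clockwise" is taken as seen when looking down the +z axis onto the xy-plane.

```python
import math

def plane_clockwise_order(points, center=None):
    """Return point indices ordered by clockwise angle in the xy-plane.

    Assumptions (not confirmed by the source): plane normal n = +z,
    reference direction v_ref = +x, angles measured around `center`
    (the centroid if omitted), clockwise as viewed from +z.
    """
    if center is None:
        count = len(points)
        center = tuple(sum(p[k] for p in points) / count for k in range(3))

    def clockwise_angle(p):
        # Project onto the plane orthogonal to n by dropping the z component.
        dx, dy = p[0] - center[0], p[1] - center[1]
        # atan2 gives the counter-clockwise angle from +x; negate it and
        # wrap into [0, 2*pi) to obtain the clockwise angle from v_ref.
        return (-math.atan2(dy, dx)) % (2 * math.pi)

    return sorted(range(len(points)), key=lambda i: clockwise_angle(points[i]))

# Four points on the unit circle, listed in arbitrary order.
example_points = [(0, 1, 0), (-1, 0, 0), (1, 0, 0), (0, -1, 0)]
example_order = plane_clockwise_order(example_points, center=(0, 0, 0))
# example_order == [2, 3, 1, 0], i.e. +x, then -y, then -x, then +y
```

The returned indices can be used to gather the per-point features in the same order, in the same way as the z-axis sort shown in Figure 3.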