Equivariant vs. Invariant Layers: A Comparison of Backbone and Pooling for Point Cloud Classification

Abihith Kothapalli, Ashkan Shahbazi, Xinran Liu, Robert Sheng, Soheil Kolouri

TL;DR

This work studies how to design permutation-invariant models for point cloud classification by disentangling the roles of the permutation **equivariant** backbone and the permutation **invariant** pooling layer. It evaluates 77 backbone–pooling combinations on three benchmarks under a unified training and evaluation protocol, so that differences in accuracy can be attributed to architecture. The key findings are that transport-based and attention-based pooling give substantial gains for simple backbones, and that optimal-transport (OT) based pooling is robust in low-data regimes. Moreover, the choice of pooling can matter more than making the backbone deeper or wider, and pairing complementary pooling layers yields additional gains. These results offer practical guidelines for building robust permutation-invariant point cloud classifiers and point to future work on rotation invariance and newer backbone designs. The analysis frames the invariant design in terms of the permutation group $\mathcal{G}$ acting on the points of a cloud, and highlights the practical trade-offs between pooling richness, data availability, and backbone complexity.
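
For reference, the two symmetry properties contrasted above can be written compactly. The symbols $f$ (backbone), $g$ (pooling), $h$ (head), $X$ (a point cloud with one point per row), and $\pi$ (a permutation of the rows) are notation assumed for this sketch and are not taken from the paper:

$$
f(\pi X) = \pi f(X), \qquad g(\pi Z) = g(Z), \qquad \text{for all } \pi \in \mathcal{G}.
$$

Composing the two gives an invariant classifier, since $h(g(f(\pi X))) = h(g(\pi f(X))) = h(g(f(X)))$.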

Abstract

Learning from set-structured data, such as point clouds, has gained significant attention from the machine learning community. Geometric deep learning provides a blueprint for designing effective set neural networks that preserve the permutation symmetry of set-structured data. We focus on permutation-invariant networks, which consist of a permutation-equivariant backbone, a permutation-invariant global pooling layer, and a regression/classification head. While existing literature has focused on improving equivariant backbones, the impact of the pooling layer is often overlooked. In this paper, we examine the interplay between permutation-equivariant backbones and permutation-invariant global pooling on three benchmark point cloud classification datasets. Our findings reveal that: 1) complex pooling methods, such as transport-based or attention-based pooling, can significantly boost the performance of simple backbones, but the benefits diminish for more complex backbones, 2) even complex backbones can benefit from complex pooling layers in low-data scenarios, 3) surprisingly, the choice of pooling layer can have a larger impact on the model's performance than adjusting the width and depth of the backbone, and 4) pairwise combinations of pooling layers can significantly improve the performance of a fixed backbone. Our comprehensive study provides insights for practitioners to design better permutation-invariant set neural networks. Our code is available at https://github.com/mint-vu/backbone_vs_pooling.
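
The three-part structure described in the abstract (equivariant backbone, invariant pooling, head) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the pointwise MLP backbone, the mean and max pooling functions, the random weights, and all names (`backbone`, `mean_pool`, `max_pool`, `head`, `classify`) are assumptions chosen only to make the symmetry properties checkable.

```python
import numpy as np

rng = np.random.default_rng(0)

def backbone(points, w1, w2):
    """Pointwise two-layer MLP: each row is processed independently,
    so permuting the input rows permutes the output rows (equivariance)."""
    hidden = np.maximum(points @ w1, 0.0)  # ReLU
    return hidden @ w2

def mean_pool(features):
    """Average over the point axis: the result ignores row order (invariance)."""
    return features.mean(axis=0)

def max_pool(features):
    """Elementwise max over the point axis: also order independent."""
    return features.max(axis=0)

def head(embedding, w_out):
    """Linear classification head producing one logit per class."""
    return embedding @ w_out

def classify(points, params, pool):
    """Backbone -> pooling -> head, the pipeline described in the abstract."""
    w1, w2, w_out = params
    return head(pool(backbone(points, w1, w2)), w_out)

# Hypothetical sizes and random (untrained) weights, for illustration only.
num_points, in_dim, hidden_dim, feat_dim, num_classes = 128, 3, 16, 8, 4
params = (
    rng.normal(size=(in_dim, hidden_dim)),
    rng.normal(size=(hidden_dim, feat_dim)),
    rng.normal(size=(feat_dim, num_classes)),
)
cloud = rng.normal(size=(num_points, in_dim))
perm = rng.permutation(num_points)

# Equivariance: permuting the input permutes the per-point features.
feats = backbone(cloud, params[0], params[1])
feats_perm = backbone(cloud[perm], params[0], params[1])
equivariant = np.allclose(feats[perm], feats_perm)

# Invariance: the logits are unchanged by the permutation, for either pooling.
logits = classify(cloud, params, mean_pool)
invariant_mean = np.allclose(logits, classify(cloud[perm], params, mean_pool))
invariant_max = np.allclose(
    classify(cloud, params, max_pool), classify(cloud[perm], params, max_pool)
)

print(equivariant, invariant_mean, invariant_max)
```

Swapping `mean_pool` for `max_pool` changes only the pooling stage while the backbone and head stay fixed, which mirrors the kind of controlled comparison the study performs with far richer pooling layers.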

Paper Structure (22 sections, 4 equations, 8 figures, 1 table)

Figures (8)

  • Figure 1: Investigating the Impact of Backbone and Pooling Combinations on Point Cloud Data. Through comprehensive experiments on three point cloud benchmarks, we evaluate the performance of models using different combinations of permutation equivariant backbones with permutation invariant pooling techniques. Additionally, we demonstrate the effectiveness of combining specific poolings to improve model performance. Our study provides insights into the benefits of equivariant and invariant layers for point cloud analysis.
  • Figure 2: Backbone vs. Pooling. This figure provides a visualization of the models' performances, reported in Table 1, as a function of the backbone complexity. We employed three indicators as proxies for backbone complexity: average forward time, average backward time, and model size. Each row represents the results from one of the datasets.
  • Figure 3: Backbone vs. Pooling under Limited Training Data. The models' performances on ModelNet40, when using 5% (left), 10% (middle), and 25% (right) of the training data. Notably, the OT-based methods show less sensitivity to the sample size.
  • Figure 4: Depth vs. Width. The performance of each pooling method on MLP backbones with varied widths and depths is illustrated in the figure. The results consistently highlight the superior performance of OT-based poolings, particularly F-PSWE and L-PSWE, across different backbone sizes. Additionally, attention-based poolings GMHA and MMHA exhibit competitive performance, closely trailing behind the OT-based methods.
  • Figure 5: Effect of Paired Pooling Methods. Test set classification accuracy for models trained with the same backbone and different pairs of pooling layers. Results are shown for MLP (left) and SAB (right) backbones. The bottom rows in both plots represent the performance of the same model with a single pooling layer.
  • ...and 3 more figures