CloudMamba: Grouped Selective State Spaces for Point Cloud Analysis

Kanglin Qu, Pan Gao, Qun Dai, Zhanzhi Ye, Rui Ye, Yuanhao Sun

TL;DR

CloudMamba addresses three core challenges in Mamba-based point cloud analysis: imperfect serialization, limited high-level geometric perception, and overfitting from per-dimension parameters in S6. It introduces sequence expanding and sequence merging to build axis-aligned causal sequences, chainedMamba to enhance global geometric perception, and GS6 to share parameters across dimensions, achieving state-of-the-art results on ModelNet40, ScanObjectNN, ShapeNet, and S3DIS with linear complexity $O(n)$. The architecture combines a hexa-orientation Mamba block (each axis-wise sequence scanned forward and backward) with an encoder–decoder and efficient downsampling/upsampling, and is validated by extensive ablations. These advances improve geometric perception and computational efficiency for point cloud understanding, and point to future directions in self-supervised pre-training and structure-preserving serialization.

Abstract

Owing to its long-range modeling ability and linear complexity, Mamba has attracted considerable attention in point cloud analysis. Despite some interesting progress, related work still suffers from imperfect point cloud serialization, insufficient high-level geometric perception, and overfitting of the selective state space model (S6) at the core of Mamba. To address these challenges, we propose an SSM-based point cloud network termed CloudMamba. Specifically, we propose sequence expanding and sequence merging: the former serializes points along each axis separately, and the latter fuses the corresponding higher-order features causally inferred from the different sequences, enabling unordered point sets to adapt more stably to the causal nature of Mamba without introducing additional parameters. Meanwhile, we design chainedMamba, which chains the forward and backward processes of the parallel bidirectional Mamba, capturing high-level geometric information during scanning. In addition, we propose a grouped selective state space model (GS6) via parameter sharing on S6, alleviating the overfitting problem caused by the computational mode of S6. Experiments on various point cloud tasks validate CloudMamba's ability to achieve state-of-the-art results with significantly less complexity.
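
The sequence expanding and sequence merging steps described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `expand_sequences` and `merge_sequences` are hypothetical, and averaging is an assumed fusion rule, since the abstract only states that merging adds no parameters.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def expand_sequences(points: Sequence[Point]) -> List[List[int]]:
    """Return one index ordering per axis (x, y, z), each sorted ascending."""
    n = len(points)
    # sorted() is stable, so points with equal coordinates keep their input order.
    return [sorted(range(n), key=lambda i, a=axis: points[i][a]) for axis in range(3)]


def merge_sequences(
    orders: Sequence[Sequence[int]],
    seq_features: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Scatter each sequence's features back to point order and average them.

    Averaging is an assumption made for this sketch; it has no learnable weights.
    """
    n = len(orders[0])
    dim = len(seq_features[0][0])
    merged = [[0.0] * dim for _ in range(n)]
    for order, feats in zip(orders, seq_features):
        for pos, point_idx in enumerate(order):
            # feats[pos] belongs to the point at position `pos` of this sequence.
            for d in range(dim):
                merged[point_idx][d] += feats[pos][d] / len(orders)
    return merged
```

In a full network, each of the three orderings would be processed by a causal sequence model before merging; here the features are passed in directly so that the scatter-back step is easy to follow.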

Paper Structure

This paper contains 22 sections, 17 equations, 16 figures, 12 tables.

Figures (16)

  • Figure 1: Inference process of the bidirectional Mamba under the parallel and chained structures, where the blue equations denote the inference process for point b in each structure. In the backward pass of the chained structure, earlier points perceive the high-level structural semantics inferred by the forward Mamba.
  • Figure 2: Computational modes of S6 and GS6, where GS6's grouping rate is 3. A multi-dimensional sequence $\boldsymbol{I} \in {\mathbb{R}^{7 \times 6}}$ is used as an example, with the subscript denoting the parameter to be used for a dimension or a set of dimensions.
  • Figure 3: Pipeline of our proposed network. The Flip layer in the chainedMamba indicates a flip operation on the sequence for global modeling. S6 in Mamba is replaced by GS6. Since each axis-wise point sequence is processed in both the forward and backward directions, there are six (hexa) orientations for causal modeling.
  • Figure 4: Illustration of the global receptive fields of the attention mechanism and hexa-orientation Mamba block, with the point in the red box as an example.
  • Figure 5: Comparison of SSM-based networks on ModelNet40 dataset, where a larger circle means more parameters.
  • ...and 11 more figures
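
The parameter sharing described in the Figure 2 caption can be sketched as follows. Two assumptions are made here and are not confirmed by the text above: the grouping rate is read as the number of consecutive dimensions that share one parameter set, and a fixed scalar decay stands in for S6's input-dependent parameters. The names `param_index` and `grouped_scan` are hypothetical.

```python
from typing import List


def param_index(num_dims: int, group_size: int) -> List[int]:
    """Map each dimension to the index of the parameter set it uses."""
    if num_dims % group_size != 0:
        raise ValueError("num_dims must be divisible by group_size")
    return [d // group_size for d in range(num_dims)]


def grouped_scan(
    x: List[List[float]], decay: List[float], group_size: int
) -> List[List[float]]:
    """Run h_t = a * h_{t-1} + x_t per dimension, with `a` shared within a group.

    This is a simplified, non-selective recurrence used only to show the sharing
    pattern; it is not the S6 or GS6 computation itself.
    """
    num_dims = len(x[0])
    idx = param_index(num_dims, group_size)
    if len(decay) != num_dims // group_size:
        raise ValueError("one decay value is needed per group")
    h = [0.0] * num_dims
    out = []
    for row in x:
        h = [decay[idx[d]] * h[d] + row[d] for d in range(num_dims)]
        out.append(list(h))
    return out
```

For the $7 \times 6$ example in Figure 2, `param_index(6, 1)` assigns a separate parameter to each dimension (the per-dimension pattern), whereas `param_index(6, 3)` yields `[0, 0, 0, 1, 1, 1]`, so only two parameter sets are needed under this reading of the grouping rate.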