Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy

Jiuming Liu, Ruiji Yu, Yian Wang, Yu Zheng, Tianchen Deng, Weicai Ye, Hesheng Wang

TL;DR

This work introduces Point Mamba, a novel point cloud backbone based on the state space model (SSM) that addresses the disorder of raw 3D points through an octree-based, z-order ordering to establish causality. The architecture embeds features and processes them with sequential Point Mamba Blocks that combine forward and backward selective scanning, achieving linear complexity and strong global modeling capabilities. Empirical results on ModelNet40 and ScanNet show competitive or superior performance compared to transformer-based backbones while reducing parameters and maintaining efficiency. The approach demonstrates the potential of SSM as a general backbone for point cloud understanding and opens avenues for efficient large-scale 3D processing.

Abstract

Recently, state space models (SSMs) have gained great attention due to their promising performance, linear complexity, and long-sequence modeling ability in both the language and image domains. However, extending SSMs to point clouds is non-trivial, because SSMs require causality while point clouds are disordered and irregular. In this paper, we propose a novel SSM-based point cloud processing backbone, named Point Mamba, with a causality-aware ordering mechanism. To construct the causal dependency relationship, we design an octree-based ordering strategy on raw irregular points, globally sorting points in a z-order sequence while retaining their spatial proximity. Our method achieves state-of-the-art performance compared with transformer-based counterparts, reaching 93.4% accuracy on the ModelNet40 classification dataset and 75.7 mIoU on the ScanNet semantic segmentation dataset. Furthermore, Point Mamba has linear complexity, which makes it more efficient than transformer-based methods. Our method demonstrates the great potential of SSMs to serve as a generic backbone for point cloud understanding. Code is released at https://github.com/IRMVLab/Point-Mamba.
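
The z-order sorting mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation, which builds an octree. It relies on the fact that sorting points by a Morton code at a fixed depth yields the same sequence as a depth-first traversal of the octree leaves, provided the child order is consistent. The function names `morton_codes` and `z_order_sort`, the bounding-cube normalization, and the x-y-z bit interleaving order are all assumptions made for illustration.

```python
import numpy as np

def morton_codes(points, depth=10):
    """Return one z-order (Morton) code per 3D point.

    Assumption: points are quantized into a 2^depth grid spanning their
    bounding cube, and bits are interleaved with x as the lowest bit.
    """
    points = np.asarray(points, dtype=np.float64)
    lo = points.min(axis=0)
    extent = (points.max(axis=0) - lo).max()
    extent = extent if extent > 0 else 1.0  # guard against a single repeated point
    cells = 1 << depth
    grid = np.floor((points - lo) / extent * cells).astype(np.int64)
    grid = np.clip(grid, 0, cells - 1)  # points on the upper face fall in the last cell

    codes = np.zeros(len(points), dtype=np.int64)
    for bit in range(depth):
        for axis in range(3):
            # Bit `bit` of coordinate `axis` goes to position 3 * bit + axis.
            codes |= ((grid[:, axis] >> bit) & 1) << (3 * bit + axis)
    return codes

def z_order_sort(points, depth=10):
    """Sort points along the z-order curve; returns (sorted_points, order)."""
    points = np.asarray(points, dtype=np.float64)
    order = np.argsort(morton_codes(points, depth), kind="stable")
    return points[order], order
```

Points that are adjacent in the resulting sequence usually share a long code prefix and therefore lie in the same coarse octree cell, which is the sense in which the ordering retains spatial proximity. With 64-bit integer codes, `depth` must stay at or below 21 in this sketch.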

Paper Structure

This paper contains 22 sections, 7 equations, 4 figures, 7 tables, and 1 algorithm.

Figures (4)

  • Figure 1: The overall architecture of Point Mamba, which contains an octree establishment layer, a feature embedding layer, and a sequence of Point Mamba Blocks and downsampling layers. $N$ is the number of input points. $C$ is the channel dimension of the point features. $N_{i}$ denotes the number of Point Mamba Blocks in the $i$-th stage. Each Point Mamba Block includes the core SSM module and the bidirectional selective scanning mechanism [49].
  • Figure 2: The overall 3D z-order curve and a 2D illustration of the octree structure and its corresponding curve. (a) is the overall z-order curve in 3D space, (b) is the 2D illustration, (c) is the octree structure, and (d) is the corresponding z-order curve of the octree structure at depth 3.
  • Figure 3: Segmentation results on the ScanNet dataset. The first column is the ground truth, the second column is the prediction of OctFormer, and the third column is the prediction of Point Mamba. The global modeling ability of Point Mamba improves the recognition and segmentation of semantically consistent objects, such as walls.
  • Figure 4: (a) GPU memory usage of Point Mamba and PCT as the sequence length varies. Point Mamba's memory usage grows linearly and stays below that of PCT. (b) Forward speed and mIoU of Point Mamba, voxel-based CNNs, and transformer backbones on ScanNet. Point Mamba achieves a competitive mIoU with a faster forward speed.
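
The bidirectional scanning named in the Figure 1 caption, and the linear scaling shown in Figure 4, can be sketched with a simplified recurrence. This is a toy illustration under stated assumptions, not the Point Mamba block itself: the state is one scalar per channel, the per-step coefficients `a`, `b`, `c` are passed in precomputed (in a selective SSM they would be derived from the input), and summing the two directions is an assumed way of combining them. The names `linear_scan` and `bidirectional_scan` are hypothetical.

```python
import numpy as np

def linear_scan(x, a, b, c):
    """Causal recurrence h_t = a_t * h_{t-1} + b_t * x_t, y_t = c_t * h_t.

    x, a, b, c all have shape (L, C). The loop visits each step once,
    so time is O(L) and the carried state is O(C).
    """
    x = np.asarray(x, dtype=np.float64)
    h = np.zeros(x.shape[1])
    y = np.empty_like(x)
    for t in range(x.shape[0]):
        h = a[t] * h + b[t] * x[t]  # state update uses only past and current input
        y[t] = c[t] * h
    return y

def bidirectional_scan(x, a, b, c):
    """Run the scan over the sequence and its reverse, then sum (assumed fusion)."""
    x = np.asarray(x, dtype=np.float64)
    forward = linear_scan(x, a, b, c)
    backward = linear_scan(x[::-1], a[::-1], b[::-1], c[::-1])[::-1]
    return forward + backward
```

Because each direction is causal on its own, a single scan only lets a point see its predecessors in the sequence. Adding the reversed pass gives every point access to the whole ordered sequence while keeping the cost linear in the number of points.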