PMA: Towards Parameter-Efficient Point Cloud Understanding via Point Mamba Adapter

Yaohua Zha, Yanzi Wang, Hang Guo, Jinpeng Wang, Tao Dai, Bin Chen, Zhihao Ouyang, Xue Yuerong, Ke Chen, Shu-Tao Xia

TL;DR

The paper addresses the underutilization of intermediate representations in pre-trained point cloud models. It introduces Point Mamba Adapter (PMA), a parameter-efficient fine-tuning framework that fuses features from all backbone layers with a Mamba-based state-space model. Because 3D space is isotropic and offers no natural token order, a geometry-constrained gate prompt generator (G2PG) learns a spatially aware ordering and gating, so the whole pipeline can be adapted end to end while training only a small set of components. PMA achieves strong performance on object classification, part segmentation, and few-shot learning with far fewer trainable parameters than full fine-tuning, which makes large pre-trained 3D models more practical to deploy on resource-limited devices. The results highlight the value of combining multi-layer intermediate semantics with geometry-aware sequencing for fine-grained point cloud understanding.

Abstract

Applying pre-trained models to assist point cloud understanding has recently become a mainstream paradigm in 3D perception. However, existing application strategies are straightforward, feeding only the final output of the pre-trained model to the various task heads. This neglects the rich complementary information in the intermediate layers and thus fails to fully unlock the potential of pre-trained models. To overcome this limitation, we propose an orthogonal solution: Point Mamba Adapter (PMA), which constructs an ordered feature sequence from all layers of the pre-trained model and leverages Mamba to fuse all complementary semantics, thereby promoting comprehensive point cloud understanding. Constructing this ordered sequence is non-trivial due to the inherent isotropy of 3D space. Therefore, we further propose a geometry-constrained gate prompt generator (G2PG) shared across different layers, which applies shared geometric constraints to the output gates of Mamba and dynamically optimizes the spatial order, thus enabling more effective integration of multi-layer information. Extensive experiments on challenging point cloud datasets across various tasks demonstrate that PMA elevates the capability for point cloud understanding to a new level by fusing diverse complementary intermediate features. Code is available at https://github.com/zyh16143998882/PMA.
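
To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It orders the tokens of each layer by one coordinate of their patch centers, concatenates the layers into a single sequence, and runs a plain linear recurrence with a per-token output gate. Everything here is an assumption made for illustration: the function names, the fixed sort along one axis (PMA learns the order), the scalar `decay` (Mamba uses input-dependent state-space parameters), and the sigmoid gate computed from hypothetical `weights`.

```python
import math
from typing import List, Sequence

Vector = List[float]


def order_tokens(centers: Sequence[Sequence[float]],
                 features: Sequence[Vector], axis: int = 0) -> List[Vector]:
    """Sort token features by one coordinate of their patch centers.

    A fixed axis sort is only a stand-in for a learned spatial order.
    """
    idx = sorted(range(len(centers)), key=lambda i: centers[i][axis])
    return [features[i] for i in idx]


def build_sequence(layer_features: Sequence[Sequence[Vector]],
                   centers: Sequence[Sequence[float]]) -> List[Vector]:
    """Concatenate the ordered tokens of every layer, shallow to deep."""
    seq: List[Vector] = []
    for feats in layer_features:
        seq.extend(order_tokens(centers, feats))
    return seq


def geometry_gates(centers: Sequence[Sequence[float]],
                   weights: Sequence[float], num_layers: int,
                   axis: int = 0) -> List[float]:
    """One sigmoid gate per token, computed from geometry and reused for every layer."""
    ordered = sorted(centers, key=lambda c: c[axis])
    per_layer = [1.0 / (1.0 + math.exp(-sum(w * ci for w, ci in zip(weights, c))))
                 for c in ordered]
    return per_layer * num_layers


def gated_scan(seq: Sequence[Vector], gates: Sequence[float],
               decay: float = 0.9) -> List[Vector]:
    """h_t = decay * h_{t-1} + (1 - decay) * x_t, output y_t = g_t * h_t."""
    h = [0.0] * len(seq[0])
    outputs: List[Vector] = []
    for x, g in zip(seq, gates):
        h = [decay * hi + (1.0 - decay) * xi for hi, xi in zip(h, x)]
        outputs.append([g * hi for hi in h])
    return outputs


# Toy input: 2 layers, 3 tokens each, 1-dimensional features (made-up values).
centers = [(2.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
layers = [[[1.0], [2.0], [3.0]], [[4.0], [5.0], [6.0]]]
sequence = build_sequence(layers, centers)
gates = geometry_gates(centers, weights=(0.5, 0.0, 0.0), num_layers=len(layers))
fused = gated_scan(sequence, gates, decay=0.5)
```

Because the recurrence carries its state across layer boundaries, the output at a deep-layer token depends on the shallow-layer tokens that precede it in the sequence, which is the sense in which a sequence model can fuse multi-layer features.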

Paper Structure

This paper contains 24 sections, 5 equations, 3 figures, and 7 tables.

Figures (3)

  • Figure 1: The impact of freezing different numbers of pre-trained model layers on the output features and their effect on downstream tasks. The performance is evaluated using the Point-MAE pre-trained model on classification and part segmentation tasks.
  • Figure 2: The parameter-efficient fine-tuning pipeline based on our Point Mamba Adapter. It consists of three main components: the pre-trained model, the shared Geometry-constrained Gate Prompt Generator (G2PG), and the Point Mamba Adapter. During fine-tuning, only the CLS token, the G2PG, the Mamba Adapter, and the downstream task head are updated.
  • Figure 3: Details of our Geometry-constrained Gate Prompt Generator (G2PG).
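
The update rule stated in the Figure 2 caption (only the CLS token, the G2PG, the Mamba Adapter, and the task head are trained) can be sketched as a filter over parameter names. This is an illustrative sketch only: the name prefixes and the parameter sizes below are hypothetical placeholders, not taken from the PMA code base or the paper's reported numbers.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical name prefixes for the four trainable groups named in the caption.
TRAINABLE_PREFIXES: Tuple[str, ...] = ("cls_token", "g2pg.", "mamba_adapter.", "head.")


def is_trainable(name: str, prefixes: Sequence[str] = TRAINABLE_PREFIXES) -> bool:
    """Return True if the parameter belongs to one of the trainable groups."""
    return name.startswith(tuple(prefixes))


def split_parameters(param_sizes: Dict[str, int],
                     prefixes: Sequence[str] = TRAINABLE_PREFIXES
                     ) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Partition parameters into (trainable, frozen) by name prefix."""
    trainable = {n: s for n, s in param_sizes.items() if is_trainable(n, prefixes)}
    frozen = {n: s for n, s in param_sizes.items() if not is_trainable(n, prefixes)}
    return trainable, frozen


def trainable_fraction(param_sizes: Dict[str, int],
                       prefixes: Sequence[str] = TRAINABLE_PREFIXES) -> float:
    """Share of all parameters that would receive gradient updates."""
    trainable, _ = split_parameters(param_sizes, prefixes)
    total = sum(param_sizes.values())
    return sum(trainable.values()) / total if total else 0.0


# Made-up parameter names and sizes, for illustration only.
toy_model = {
    "backbone.blocks.0.attn.weight": 1000,
    "backbone.blocks.1.attn.weight": 1000,
    "cls_token": 10,
    "g2pg.proj.weight": 20,
    "mamba_adapter.in_proj.weight": 50,
    "head.fc.weight": 20,
}
trainable, frozen = split_parameters(toy_model)
```

In a PyTorch training script, the same predicate would typically be applied while iterating over `model.named_parameters()` to set each parameter's `requires_grad` flag, so that the optimizer only updates the selected groups.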