Spectral Informed Mamba for Robust Point Cloud Processing
Ali Bahri, Moslem Yazdanpanah, Mehrdad Noori, Sahar Dastani, Milad Cheraghalikhani, David Osowiechi, Gustavo Adolfo Vargas Hakim, Farzad Beizaee, Ismail Ben Ayed, Christian Desrosiers
TL;DR
This work extends state-space models to 3D point clouds by introducing a spectral-informed Mamba framework. It capitalizes on the Laplacian spectrum of a patch-connectivity graph to achieve isometry-invariant token ordering (SAST), a recursive spectral partitioning scheme for accurate point-level segmentation (HLT), and a tour-preserving token placement strategy for Masked Autoencoders (TAR). Empirical results show consistent gains in object classification, segmentation, and few-shot scenarios while maintaining favorable computational efficiency. Overall, the approach offers a robust, geometry-aware alternative to grid-based traversals for point-cloud processing with Mamba networks.
Abstract
State space models have shown significant promise in Natural Language Processing (NLP) and, more recently, computer vision. This paper introduces a new methodology leveraging Mamba and Masked Autoencoder networks for point cloud data in both supervised and self-supervised learning. We propose three key contributions to enhance Mamba's capability in processing complex point cloud structures. First, we exploit the spectrum of a graph Laplacian to capture patch connectivity, defining an isometry-invariant traversal order that is robust to viewpoint changes and captures shape manifolds better than traditional 3D grid-based traversals. Second, we adapt the method to segmentation via a recursive patch partitioning strategy informed by Laplacian spectral components, allowing finer integration and segment analysis. Third, we address token placement in Masked Autoencoders for Mamba by restoring tokens to their original positions, which preserves the essential traversal order and improves learning. Extensive experiments show that our approach improves on state-of-the-art baselines in classification, segmentation, and few-shot tasks.
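
To make the first contribution concrete, the following is a minimal sketch of a spectral traversal order, not the authors' implementation. It assumes that each patch is represented by its center, that patch connectivity is a symmetrized k-nearest-neighbor graph with Gaussian edge weights, and that patches are ordered by the Fiedler vector (the Laplacian eigenvector with the second-smallest eigenvalue). The function name `spectral_patch_order` and the parameters `k` and `sigma` are hypothetical; the paper's actual construction (which eigenvectors are used, how the sign ambiguity is resolved) may differ.

```python
import numpy as np

def spectral_patch_order(centers, k=4, sigma=None):
    """Order patches by the Fiedler vector of a k-NN graph Laplacian.

    centers: (n, 3) array of patch centers (assumed distinct).
    Returns an index array giving a traversal order of the patches.
    """
    n = len(centers)
    # Pairwise Euclidean distances; these are unchanged by rotations
    # and translations, so the graph is the same for any rigid motion.
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)          # exclude self-matches
    nn = np.argsort(dist, axis=1)[:, :k]    # k nearest neighbors per patch

    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()
    if sigma is None:
        sigma = dist[rows, cols].mean()     # assumed bandwidth heuristic

    # Gaussian edge weights on k-NN edges, symmetrized.
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-dist[rows, cols] ** 2 / (2.0 * sigma ** 2))
    W = np.maximum(W, W.T)

    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    _, eigvecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                 # defined only up to sign
    return np.argsort(fiedler)

# Usage: patch centers sampled along a 3D arc, presented in shuffled order.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 32)
curve = np.stack([np.cos(np.pi * t), np.sin(np.pi * t), t], axis=1)
perm = rng.permutation(len(t))
order = spectral_patch_order(curve[perm])
recovered = perm[order]  # curve positions in the order they are traversed
```

In this toy case the traversal follows the arc from one end to the other, in either direction because the eigenvector sign is arbitrary. Since the graph is built from pairwise distances only, rotating the input leaves the ordering unchanged up to that reversal, which a 3D grid-based traversal does not guarantee.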
