Rethinking Attention Module Design for Point Cloud Analysis
Chengzhi Wu, Kaige Wang, Zeyun Zhong, Hao Fu, Junwei Zheng, Jiaming Zhang, Julius Pfrommer, Jürgen Beyerer
TL;DR
This paper addresses the difficulty of fairly comparing attention modules for point cloud analysis by proposing a unified base framework and systematically evaluating both global-based and local-based attention designs along four axes: neighbor selection, local feature aggregation, attention scoring, and position encoding. It finds no single universally optimal design and instead provides task-specific best-practice guidelines that improve performance on classification and segmentation benchmarks while reducing parameters and FLOPs. Applied to ModelNet40 and ScanObjectNN (classification) and ShapeNetPart (segmentation), the tailored modules yield competitive results with improved efficiency. The study offers practical insights for designing future point cloud networks and clarifies how design choices interact with downstream tasks.
Abstract
In recent years, there have been significant advancements in applying attention mechanisms to point cloud analysis. However, the attention module variants proposed in different papers are often evaluated under differing settings and tasks, and may incorporate additional training strategies. This heterogeneity makes it difficult to compare these variants fairly. In this paper, we address this issue by rethinking and exploring attention module design within a consistent base framework and consistent settings. Both global-based and local-based attention methods are studied, with a focus on the selection basis and scales of neighbors for local-based attention. Different combinations of aggregated local features and computation methods for attention scores are evaluated, ranging from the early addition/concatenation-based approach to the widely adopted dot product-based method and the recently proposed vector attention technique. Various position encoding methods are also investigated. Our extensive experimental analysis reveals that there is no universally optimal design across diverse point cloud tasks. Instead, drawing from best practices, we propose tailored attention modules for specific tasks, leading to superior performance on point cloud classification and segmentation benchmarks.
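As a rough illustration of the three attention-score families named in the abstract (addition-based, dot product-based, and vector attention), the NumPy sketch below applies each one to the k nearest neighbors of every point. It is not the authors' implementation. The function names (`knn_indices`, `local_attention`) are hypothetical, random matrices stand in for learned projections, selecting neighbors by kNN in coordinate space is only one assumed selection basis, and position encoding is omitted.

```python
import numpy as np

def knn_indices(xyz, k):
    """Indices of the k nearest neighbors of each point (the point itself included)."""
    d2 = ((xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)  # (N, N) squared distances
    return np.argsort(d2, axis=1)[:, :k]                     # (N, k)

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_attention(feats, xyz, k, mode, rng):
    """Local attention over k neighbors. Random weights stand in for learned ones (assumption)."""
    n, c = feats.shape
    idx = knn_indices(xyz, k)
    wq, wk, wv = (rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(3))
    q = feats @ wq            # (N, C) one query per point
    key = (feats @ wk)[idx]   # (N, k, C) keys gathered from neighbors
    val = (feats @ wv)[idx]   # (N, k, C) values gathered from neighbors
    if mode == "add":
        # Addition-based scoring: a scalar score from tanh(q + k) projected by a vector.
        v = rng.standard_normal(c) / np.sqrt(c)
        w = softmax(np.tanh(q[:, None, :] + key) @ v, axis=1)[..., None]  # (N, k, 1)
    elif mode == "dot":
        # Scaled dot-product scoring: one scalar weight per neighbor.
        scores = np.einsum("nc,nkc->nk", q, key) / np.sqrt(c)
        w = softmax(scores, axis=1)[..., None]                            # (N, k, 1)
    elif mode == "vector":
        # Vector attention: a separate weight per neighbor and per channel,
        # computed here from the query-key difference (one common choice).
        wg = rng.standard_normal((c, c)) / np.sqrt(c)
        w = softmax((q[:, None, :] - key) @ wg, axis=1)                   # (N, k, C)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return (w * val).sum(axis=1), w  # aggregated features (N, C) and the weights

rng = np.random.default_rng(0)
xyz = rng.standard_normal((32, 3))    # toy point coordinates
feats = rng.standard_normal((32, 8))  # toy per-point features
for mode in ("add", "dot", "vector"):
    out, w = local_attention(feats, xyz, k=4, mode=mode, rng=rng)
    print(mode, out.shape, w.shape)
```

In this sketch the scalar variants share one weight across all channels of a neighbor, while the vector variant normalizes over neighbors independently for each channel. That difference is the main structural distinction between the families being compared.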
