Efficiently Expanding Receptive Fields: Local Split Attention and Parallel Aggregation for Enhanced Large-scale Point Cloud Semantic Segmentation

Haodong Wang; Chongyu Wang; Yinghui Quan; Di Wang

TL;DR

We propose LSNet, a novel framework for large-scale point cloud semantic segmentation that outperforms state-of-the-art semantic segmentation networks on three benchmark datasets: S3DIS, Toronto3D, and SensatUrban.

Abstract

Expanding the receptive field of a deep learning model for large-scale 3D point cloud segmentation is an effective way to capture rich contextual information, which in turn improves the network's ability to learn meaningful features. However, a larger receptive field often increases computational complexity and the risk of overfitting, which undermines both the efficiency and the effectiveness of learning. To address these limitations, we propose the Local Split Attention Pooling (LSAP) mechanism, which expands the receptive field through a series of local split operations and thereby captures broader contextual knowledge. At the same time, it optimizes the computational workload of the attention-pooling layers to keep processing streamlined. Building on LSAP, we introduce a Parallel Aggregation Enhancement (PAE) module that processes 2D and 3D neighboring information in parallel to further enhance contextual representations within the network. Based on these designs, we propose LSNet, a novel framework for large-scale point cloud semantic segmentation. Extensive evaluations show that the PAE module can be integrated seamlessly into existing frameworks, yielding significant improvements in mean intersection over union (mIoU) of up to 11%. Furthermore, LSNet outperforms state-of-the-art semantic segmentation networks on three benchmark datasets: S3DIS, Toronto3D, and SensatUrban. Notably, our method achieves a speedup of approximately 38.8% compared with methods that use similar-sized receptive fields, which highlights its computational efficiency and practical utility in real-world large-scale scenes.
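For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how it is commonly computed from per-point labels. It is not taken from the paper: the function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration, and individual benchmarks may handle absent classes differently.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection over union, averaged over classes that appear
    in the predictions or in the ground truth (illustrative helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # points where prediction and label agree on class c
    pred_count = [0] * num_classes    # points predicted as class c
    target_count = [0] * num_classes  # points labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both pred and target
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Five points, three classes; one point of class 1 is mislabeled as class 0.
    print(mean_iou([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], num_classes=3))
```

In the toy call, the per-class IoU values are 1/2, 2/3, and 1, so the mean is 13/18.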

Paper Structure (31 sections, 9 equations, 10 figures, 6 tables, 1 algorithm)

Introduction
Related Work
Projection-based Methods
Voxel-based Methods
Point-based Methods
Large-scale Point Cloud Segmentation
Methodology
Network Framework
Local Split Attention Pooling Mechanism
Parallel Aggregation Enhancement Module
Feature Max Aggregated Module
Experiments and Results
Experimental Details
Experimental Datasets
S3DIS Dataset
...and 16 more sections

Figures (10)

Figure 1: The overall architecture of the proposed LSNet.
Figure 2: Key components of the proposed LSNet. The yellow panel shows the parallel aggregation enhancement module, which serves as the main framework of the encoding layer. The pink panel shows the feature max aggregated module, which aggregates point features across scales. The gray panel shows the details of the local split attention pooling mechanism.
Figure 3: Upsampling and max pooling components. Red points: low-resolution layer features; gray points: high-resolution layer features; US: upsampling; Max: max pooling; KNN: K-nearest neighbors.
Figure 4: Example view of the S3DIS dataset. Each semantic category is colored randomly.
Figure 5: Example view of the Toronto3D dataset. Each semantic category is colored randomly.
...and 5 more figures
