Density-aware global-local attention network for point cloud segmentation

Chade Li, Pengju Zhang, Jiaming Zhang, Yihong Wu

TL;DR

The paper addresses the challenge of 3D point cloud segmentation in scenes with small objects and imbalanced category sizes. It introduces Density-based Global-Local Attention (DbGLA), which combines density-aware local windows (via DBSCAN density partitioning) with a global inter-area attention path, and adds a category-response loss (CR-Loss) plus a category-presence head to balance learning across categories. Experimental results across indoor, outdoor, and part segmentation benchmarks show consistent gains over baselines, with notable improvements for small or underrepresented categories and validation on real oil-field data. The work offers a scalable approach to robust 3D segmentation by jointly enriching local detail capture and global scene understanding, with potential for extension to very large outdoor environments.
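
The density partitioning step can be illustrated with a small, self-contained sketch. It is not the authors' implementation: it uses a brute-force DBSCAN written in plain Python, and the names `dbscan`, `areas_with_windows`, `small`, `large` and `dense_at`, as well as the rule "denser areas get the smaller window", are assumptions made for illustration only.

```python
from math import dist

NOISE = -1  # label for points that belong to no dense area


def dbscan(points, eps, min_pts):
    """Brute-force DBSCAN: returns one label per point (0, 1, ... or NOISE)."""
    labels = [None] * len(points)

    def neighbours(i):
        # Radius query by exhaustive search; the point itself is included.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # core point: keep growing the area
                queue.extend(nbrs)
        cluster += 1
    return labels


def areas_with_windows(points, eps, min_pts, small=8, large=32, dense_at=4.0):
    """Group points into local areas and pick a window size per area.

    Assumption of this sketch (not taken from the paper): an area whose mean
    eps-neighbour count reaches `dense_at` gets the `small` window, any other
    area (including the noise group) gets the `large` one.
    """
    areas = {}
    for idx, lab in enumerate(dbscan(points, eps, min_pts)):
        areas.setdefault(lab, []).append(idx)
    result = {}
    for lab, idxs in areas.items():
        # Density proxy: mean number of in-area neighbours within eps.
        mean_nbrs = sum(
            sum(dist(points[i], points[j]) <= eps for j in idxs) for i in idxs
        ) / len(idxs)
        window = small if mean_nbrs >= dense_at else large
        result[lab] = {"indices": idxs, "window": window}
    return result
```

For example, `areas_with_windows(points, eps=0.2, min_pts=3)` returns a dictionary keyed by area label, where each entry lists the point indices of that area and the window size chosen for it. A production pipeline would more likely rely on a library DBSCAN with a spatial index rather than this quadratic search.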

Abstract

3D point cloud segmentation has a wide range of applications in areas such as autonomous driving, augmented reality, virtual reality and digital twins. Point cloud data collected in real scenes often contain small objects and categories with few samples, which existing networks handle poorly. To address this, we propose a point cloud segmentation network that fuses density-aware local attention with global attention. The core idea is to increase the effective receptive field of each point while reducing the loss of information about small objects in dense areas. Specifically, we assign windows of different sizes to local areas of different densities and compute attention within each window. Furthermore, we treat each local area as an independent token for global attention over the entire input. A category-response loss is also proposed to balance the processing of objects of different categories and sizes. In particular, we add a fully connected layer in the middle of the network to predict which object categories are present, and construct a binary cross-entropy loss on this prediction. In experiments, our method achieves competitive results on semantic segmentation and part segmentation tasks across several publicly available datasets. Experiments on point cloud data from complex real-world scenes filled with tiny objects also validate the strong segmentation capability of our method for small objects and small-sample categories.
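
The category-presence term described in the abstract can be sketched as a multi-label binary cross-entropy over per-category logits. This is a minimal illustration under stated assumptions: the function names `presence_targets` and `bce_with_logits` are hypothetical, the logits stand in for the output of the extra fully connected layer, and how this term is weighted against the segmentation loss is not specified here, so it is left out.

```python
from math import exp, log1p


def presence_targets(point_labels, num_classes):
    """Binary target per category: 1.0 if any point in the scene has that label."""
    present = set(point_labels)
    return [1.0 if c in present else 0.0 for c in range(num_classes)]


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over categories, computed from raw logits.

    Uses the numerically stable form max(x, 0) - x*t + log(1 + exp(-|x|)),
    which equals -[t*log(sigmoid(x)) + (1 - t)*log(1 - sigmoid(x))].
    """
    total = 0.0
    for x, t in zip(logits, targets):
        total += max(x, 0.0) - x * t + log1p(exp(-abs(x)))
    return total / len(logits)


# Hypothetical scene with four categories, of which only 0 and 2 occur.
targets = presence_targets([0, 2, 2, 0], num_classes=4)
# Hypothetical logits from the presence head (one value per category).
loss = bce_with_logits([3.0, -2.0, 1.5, -4.0], targets)
```

Because the target is defined per scene rather than per point, a rare category contributes as much to this term as a frequent one, which matches the stated goal of balancing categories of different sizes.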

Paper Structure

This paper contains 19 sections, 10 equations, 6 figures, 11 tables.

Figures (6)

  • Figure 1: Global attention carried out between local areas and local attention with different window sizes within local areas of different densities.
  • Figure 2: Overview of proposed methods (taking outdoor scenarios as an example).
  • Figure 3: Visualisation of the local-area density division of a point cloud (shown for a point cloud file from SemanticKITTI [38]).
  • Figure 4: A schematic diagram of two connected local attention calculation modules.
  • Figure 5: Qualitative and ablation results for semantic segmentation on the outdoor point cloud dataset Semantic3D [22].
  • ...and 1 more figure