Unsupervised deep learning for semantic segmentation of multispectral LiDAR forest point clouds

Lassi Ruoppa, Oona Oinonen, Josef Taher, Matti Lehtomäki, Narges Takhtkeshha, Antero Kukko, Harri Kaartinen, Juha Hyyppä

TL;DR

This work tackles leaf–wood segmentation in high-density multispectral airborne laser scanning (ALS) forest point clouds without relying on labeled data. It introduces GrowSP-ForMS, an unsupervised deep learning model adapted from GrowSP that leverages covariance-based geometric descriptors, a graph-based superpoint constructor, adaptive clustering weights, and oversegmentation to improve semantic leaf–wood separation on multispectral data. Across extensive boreal forest data, GrowSP-ForMS achieves a mean IoU of $69.6\%$ and mean accuracy of $84.3\%$, outperforming unsupervised baselines and approaching early supervised methods, with multispectral information providing an additional $\sim5.6$ percentage points in mIoU. The results establish a new state of the art for unsupervised leaf–wood segmentation on MS ALS data, and the paper reports concrete ablations and outlines future directions (backbone improvements, unsupervised contrastive pretraining, and extension to other LiDAR modalities).

Abstract

Point clouds captured with laser scanning systems from forest environments can be utilized in a wide variety of applications within forestry and plant ecology, such as the estimation of tree stem attributes, leaf angle distribution, and above-ground biomass. However, effectively utilizing the data in such tasks requires the semantic segmentation of the data into wood and foliage points, also known as leaf-wood separation. The traditional approach to leaf-wood separation has been geometry- and radiometry-based unsupervised algorithms, which tend to perform poorly on data captured with airborne laser scanning (ALS) systems, even with a high point density. While recent machine and deep learning approaches achieve great results even on sparse point clouds, they require manually labeled training data, which is often extremely laborious to produce. Multispectral (MS) information has been demonstrated to have potential for improving the accuracy of leaf-wood separation, but quantitative assessment of its effects has been lacking. This study proposes a fully unsupervised deep learning method, GrowSP-ForMS, which is specifically designed for leaf-wood separation of high-density MS ALS point clouds and based on the GrowSP architecture. GrowSP-ForMS achieved a mean accuracy of 84.3% and a mean intersection over union (mIoU) of 69.6% on our MS test set, outperforming the unsupervised reference methods by a significant margin. When compared to supervised deep learning methods, our model performed similarly to the slightly older PointNet architecture but was outclassed by more recent approaches. Finally, two ablation studies were conducted, which demonstrated that our proposed changes increased the test set mIoU of GrowSP-ForMS by 29.4 percentage points (pp) in comparison to the original GrowSP model and that utilizing MS data improved the mIoU by 5.6 pp from the monospectral case.
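The two metrics quoted in the abstract can be illustrated with a short sketch. It assumes that "mean accuracy" is the per-class accuracy (recall) averaged over the wood and foliage classes and that mIoU is the per-class intersection over union averaged the same way; the paper's exact definitions are not given in this excerpt. The function names and the label encoding (0 = foliage, 1 = wood) are hypothetical.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes=2):
    """Count points per (true class, predicted class) pair."""
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)
    flat = y_true * num_classes + y_pred
    counts = np.bincount(flat, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_accuracy_and_miou(y_true, y_pred, num_classes=2):
    """Return (mean per-class accuracy, mean IoU).

    Assumption: both metrics are unweighted averages over classes.
    Classes that never occur are ignored via nanmean.
    """
    cm = confusion_matrix(y_true, y_pred, num_classes)
    tp = np.diag(cm).astype(float)          # correctly labeled points per class
    n_true = cm.sum(axis=1).astype(float)   # points that truly belong to each class
    n_pred = cm.sum(axis=0).astype(float)   # points predicted as each class
    union = n_true + n_pred - tp
    with np.errstate(divide="ignore", invalid="ignore"):
        acc = tp / n_true
        iou = tp / union
    return float(np.nanmean(acc)), float(np.nanmean(iou))


# Toy example with hypothetical labels: 0 = foliage, 1 = wood.
labels = [0, 0, 0, 1, 1, 1, 1, 1]
preds = [0, 0, 1, 1, 1, 1, 1, 0]
m_acc, m_iou = mean_accuracy_and_miou(labels, preds)
```

Averaging over classes rather than over points matters for leaf–wood separation because one class typically contains far more points than the other, so plain overall accuracy can look high even when the rarer class is segmented poorly.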

Paper Structure

This paper contains 45 sections, 16 equations, 10 figures, 10 tables, 1 algorithm.

Figures (10)

  • Figure 1: Visualization of the data preprocessing steps for plot #2. The points in (b) and (c) are colored based on the $z$-coordinates, while (a) and (d) use pseudo colors generated from scaled reflectance values of scanners 1, 2, and 3 for the red, green, and blue channels respectively. (a) Original multispectral point cloud of the forest plot. (b) Normalized point cloud from which the ground points and elevation have been removed. (c) Outlines of the cylindrical neighborhoods of radius $r_c=4.2$ m used as training data superimposed on the normalized point cloud in the $xy$-plane. (d) Example of a cylindrical neighborhood that has been cut out from the normalized point cloud.
  • Figure 2: Visualization of a section from the manually annotated point cloud of plot #2. (a) Instance segmentation of trees, where each individual tree has been colored with a distinct color. Points that are not a part of any segmented tree instance have been colored gray. (b) Semantic annotations of individual points. The colors red and green represent wood and foliage points respectively. (c) Section of an individual tree showing the finer details of the semantic annotation.
  • Figure 3: Visualization of the split between training and testing data for our manually labeled data set. (a) Train-test split for plot #1. (b) Train-test split for plot #2.
  • Figure 4: Overview of the generic GrowSP model architecture. The learning framework consists of three main modules: a superpoint constructor, a neural network feature extractor, and a semantic primitive clustering module. In the figure, $\bm{\mathcal{P}}_i$ denotes the $i$th 3D point cloud in a given data set. The figure has been reconstructed based on Zhang et al. (2023).
  • Figure 5: Comparison between geometric features and point feature histogram (PFH) descriptors for one cylinder from plot #1. The geometric features have been computed for the entire plot prior to dividing it into smaller cylinders. (a) Linearity. (b) Planarity. (c) Sphericity. (d) Verticality. (e) First principal component. (f) The first point feature histogram descriptor from 10-dimensional PFHs. Note that the values have been scaled to the range [0,1] for visualization purposes.
  • ...and 5 more figures
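
The covariance-based geometric features named in the Figure 5 caption (linearity, planarity, sphericity, verticality) can be sketched as follows. This is a minimal illustration using eigenvalue-based definitions that are common in the point cloud literature; the exact formulas, neighborhood definition, and normalization used in the paper are not given in this excerpt and are assumptions here. The function name `covariance_features` is hypothetical.

```python
import numpy as np


def covariance_features(neighborhood):
    """Eigenvalue-based shape features for one local point neighborhood.

    Assumed definitions, with eigenvalues l1 >= l2 >= l3 of the 3x3 covariance:
      linearity   = (l1 - l2) / l1
      planarity   = (l2 - l3) / l1
      sphericity  = l3 / l1
      verticality = 1 - |n_z|, where n is the eigenvector of l3 (the normal)
    """
    pts = np.asarray(neighborhood, dtype=float)   # shape (k, 3)
    cov = np.cov(pts, rowvar=False)               # 3x3 covariance of x, y, z
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    l3, l2, l1 = np.clip(eigvals, 0.0, None)      # guard against tiny negatives
    if l1 <= 0.0:                                 # all points coincide
        return {"linearity": 0.0, "planarity": 0.0,
                "sphericity": 0.0, "verticality": 0.0}
    normal = eigvecs[:, 0]                        # direction of least variance
    return {
        "linearity": float((l1 - l2) / l1),
        "planarity": float((l2 - l3) / l1),
        "sphericity": float(l3 / l1),
        "verticality": float(1.0 - abs(normal[2])),
    }


# Toy neighborhood: a flat horizontal patch (5 x 3 grid at z = 0).
patch = [[x, y, 0.0] for x in range(5) for y in range(3)]
features = covariance_features(patch)
```

With these definitions, linearity, planarity, and sphericity sum to one. Under this assumed convention, a horizontal patch has verticality near zero, whereas a surface with a horizontal normal, such as the side of an upright stem, has verticality near one.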