FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes

Charles Gaydon, Michel Daab, Floryne Roche

TL;DR

FRACTAL introduces an ultra-large-scale aerial Lidar dataset for 3D semantic segmentation across diverse landscapes. It draws on the Lidar HD archive, colorizes the ALS point clouds with very-high-resolution (VHR) imagery, and uses a targeted-completion sampling scheme to assemble $100{,}000$ patches covering $250\ \mathrm{km^2}$ from five regions, with seven semantic classes and reduced spatial autocorrelation. A baseline RandLA-Net model performs strongly ($\text{mIoU} = 77.5\%$, $\text{OA} = 96.1\%$), supporting FRACTAL as a versatile benchmarking resource for large-scale land monitoring. The dataset and tools come with open access to code and data, with the aim of catalyzing advances in deep learning for airborne Lidar by enabling robust cross-domain evaluation and reproducible experiments.

Abstract

Mapping agencies are increasingly adopting Aerial Lidar Scanning (ALS) as a new tool to map buildings and other above-ground structures. Processing ALS data at scale requires efficient point classification methods that perform well over highly diverse territories. Large annotated Lidar datasets are needed to evaluate these classification methods; however, current Lidar benchmarks have restricted scope and often cover a single urban area. To bridge this data gap, we introduce the FRench ALS Clouds from TArgeted Landscapes (FRACTAL) dataset: an ultra-large-scale aerial Lidar dataset made of 100,000 dense point clouds with high-quality labels for 7 semantic classes and spanning 250 km$^2$. FRACTAL achieves high spatial and semantic diversity by explicitly sampling rare classes and challenging landscapes from five different regions of France. We describe the data collection, annotation, and curation process of the dataset. We provide baseline semantic segmentation results using a state-of-the-art 3D point cloud classification model. FRACTAL aims to support the development of 3D deep learning approaches for large-scale land monitoring.
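The baseline results are reported as mean intersection-over-union (mIoU) and overall accuracy (OA). As a reading aid, here is a minimal sketch of how these two metrics are commonly computed from per-point labels. It uses the standard textbook definitions and is not taken from the authors' evaluation code; the function names are hypothetical, and the choice to skip classes that never appear in either the labels or the predictions is an assumption.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count points per (true class, predicted class) pair."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm: List[List[int]]) -> float:
    """OA: fraction of all points whose predicted class equals the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


def mean_iou(cm: List[List[int]]) -> float:
    """mIoU: average over classes of TP / (TP + FP + FN).

    Assumption: classes absent from both labels and predictions are skipped.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class otherwise
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 3 classes and 6 points (illustrative values only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
oa = overall_accuracy(cm)
miou = mean_iou(cm)
```

Because OA is dominated by frequent classes while mIoU weights every class equally, a dataset with rare classes can show a high OA alongside a noticeably lower mIoU, which matches the gap between the two reported figures.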

Paper Structure (33 sections, 9 figures, 6 tables)

Figures (9)

  • Figure 1: Random scenes from our FRACTAL dataset. Each point is colored according to its semantic class: ground (orange), vegetation (greens), building (red), water (cyan), bridge (yellow), permanent structure (purple), other (white).
  • Figure 2: The five spatial domains composing the 17,440 km² of the area of interest. The 1,049 km² reserved for sampling the test set are highlighted in red.
  • Figure 3: Enlarged view of two Lidar HD tiles and their sampled patches, most of which contain scenes of particular interest.
  • Figure 4: Relative change in the proportion of scene types in FRACTAL after sampling.
  • Figure 5: Data cataloguing boils down to a) listing and b) describing all Lidar patches, prior to sampling.
  • ...and 4 more figures