M2S-RoAD: Multi-Modal Semantic Segmentation for Road Damage Using Camera and LiDAR Data

Tzu-Yun Tseng, Hongyu Lyu, Josephine Li, Julie Stephany Berrio, Mao Shan, Stewart Worrall

TL;DR

The paper addresses the gap in rural road-damage understanding by introducing M2S-RoAD, a first multi-modal dataset for semantic segmentation that pairs camera imagery with LiDAR point clouds and labels nine road-damage types. It details data collection in rural New South Wales, calibration between sensors, image annotation, and a method for transferring image labels to 3D points, along with baseline semantic segmentation experiments using five state-of-the-art techniques. Key findings show that transformer-based models outperform CNNs and that dataset imbalance and adverse weather notably affect performance, highlighting the need for more balanced data and extended 3D evaluation. The dataset offers a practical platform for robust, multi-modal road-damage detection in rural environments and is to be released upon acceptance to spur further research.

Abstract

Road damage can create safety and comfort challenges for both human drivers and autonomous vehicles (AVs). This damage is particularly prevalent in rural areas due to less frequent surveying and maintenance of roads. Automated detection of pavement deterioration can be used as an input to AVs and driver assistance systems to improve road safety. Current research in this field has predominantly focused on urban environments driven largely by public datasets, while rural areas have received significantly less attention. This paper introduces M2S-RoAD, a dataset for the semantic segmentation of different classes of road damage. M2S-RoAD was collected in various towns across New South Wales, Australia, and labelled for semantic segmentation to identify nine distinct types of road damage. This dataset will be released upon the acceptance of the paper.

Paper Structure

This paper contains 20 sections, 2 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Scenes from rural regions with road damage.
  • Figure 2: Sensor configuration of the data collection platform equipped with a 360-degree LiDAR and a front-facing camera.
  • Figure 3: LiDAR point cloud projected onto the camera image using the camera's intrinsic parameters and the extrinsic calibration between the camera and LiDAR.
  • Figure 4: Road surface conditions in various regions of New South Wales under different weather conditions.
  • Figure 5: Label transfer from image to point cloud domain.
  • ...and 1 more figures
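
The captions for Figures 3 and 5 describe projecting LiDAR points into the camera image with the camera intrinsics and the camera-LiDAR extrinsics, then carrying image labels over to the point cloud. The summary does not give the authors' exact procedure, so the following is only a minimal sketch under stated assumptions: a distortion-free pinhole camera with its z-axis pointing forward, extrinsics given as a rotation matrix and translation vector mapping LiDAR coordinates to camera coordinates, and a simple per-point pixel lookup. The names `project_point`, `transfer_labels`, and the `unlabeled` value are hypothetical and chosen for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple


def project_point(
    point_lidar: Sequence[float],
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
    fx: float, fy: float, cx: float, cy: float,
) -> Optional[Tuple[float, float]]:
    """Project one LiDAR point to pixel coordinates (u, v).

    Assumes an ideal pinhole camera (no lens distortion) whose z-axis
    points forward. Returns None for points at or behind the image plane.
    """
    x, y, z = point_lidar
    # Extrinsics: p_cam = R * p_lidar + t
    xc, yc, zc = (
        rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z
        + translation[i]
        for i in range(3)
    )
    if zc <= 0.0:
        return None
    # Intrinsics: perspective division, then focal length and principal point
    return fx * xc / zc + cx, fy * yc / zc + cy


def transfer_labels(
    points: Sequence[Sequence[float]],
    label_image: Sequence[Sequence[int]],
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
    fx: float, fy: float, cx: float, cy: float,
    unlabeled: int = -1,
) -> List[int]:
    """Give each LiDAR point the label of the pixel it projects onto.

    Points that fall outside the image or behind the camera receive
    `unlabeled` (a hypothetical sentinel value, not from the paper).
    """
    height, width = len(label_image), len(label_image[0])
    labels = []
    for point in points:
        uv = project_point(point, rotation, translation, fx, fy, cx, cy)
        if uv is None:
            labels.append(unlabeled)
            continue
        col, row = math.floor(uv[0]), math.floor(uv[1])
        if 0 <= row < height and 0 <= col < width:
            labels.append(label_image[row][col])
        else:
            labels.append(unlabeled)
    return labels
```

A per-point lookup like this ignores occlusion: a point hidden behind a nearer object still receives the label of the pixel in front of it, so a practical pipeline would likely need a visibility or depth check on top of this sketch.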