No Thing, Nothing: Highlighting Safety-Critical Classes for Robust LiDAR Semantic Segmentation in Adverse Weather

Junsung Park, Hwijeong Lee, Inha Kang, Hyunjung Shim

TL;DR

This work tackles the difficulty of reliably segmenting safety-critical "things" classes in LiDAR data under adverse weather, identifying semantic-level and local-level feature corruption as the main bottlenecks. It introduces NTN, a framework that combines Feature Binding (FB), which anchors things classes to visually similar superclasses, with Beam-wise Feature Distillation (BFD), which preserves learning signals within each LiDAR beam under point loss. The approach is architecture-agnostic and yields state-of-the-art results on SemanticKITTI→SemanticSTF and SemanticPOSS→SemanticSTF, with gains on things classes of +4.8 and +7.9 mIoU, respectively. These results indicate more robust safety-critical perception for autonomous driving in adverse weather, with practical implications for safer navigation and planning.

Abstract

Existing domain generalization methods for LiDAR semantic segmentation under adverse weather struggle to predict "things" categories as accurately as "stuff" categories. In typical driving scenes, "things" categories can be dynamic and are associated with higher collision risks, making them crucial for safe navigation and planning. Recognizing the importance of "things" categories, we identify their performance drop as a serious bottleneck in existing approaches. We observe that adverse weather induces both degradation of semantic-level features and corruption of local features, leading to the misprediction of "things" as "stuff". To mitigate these corruptions, we propose NTN (segmeNt Things for No-accident). To address semantic-level feature corruption, we bind each point feature to its superclass, preventing things classes from being mispredicted as visually dissimilar categories. Additionally, to enhance robustness against local corruption caused by adverse weather, we define each LiDAR beam as a local region and propose a regularization term that aligns the clean data with its corrupted counterpart in feature space. NTN achieves state-of-the-art performance with a +2.6 mIoU gain on the SemanticKITTI-to-SemanticSTF benchmark and a +7.9 mIoU gain on the SemanticPOSS-to-SemanticSTF benchmark. Notably, NTN achieves +4.8 and +7.9 mIoU improvements on "things" classes, respectively, highlighting its effectiveness.
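The superclass binding described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class-to-superclass mapping `SUPERCLASS_OF`, the exponential-moving-average prototype update, and the cosine-based loss are all assumptions chosen for illustration. The sketch only shows the general idea of pulling features from augmented (corrupted) data toward a per-superclass reference computed from clean data.

```python
import numpy as np

# Hypothetical class -> superclass mapping (illustrative only, not from the paper):
# 5 fine classes grouped into 3 superclasses.
SUPERCLASS_OF = np.array([0, 0, 1, 1, 2])

def update_prototypes(bank, clean_feats, labels, momentum=0.9):
    """EMA update of one prototype per superclass from clean-data features (assumed rule)."""
    super_labels = SUPERCLASS_OF[labels]
    for s in np.unique(super_labels):
        mean_feat = clean_feats[super_labels == s].mean(axis=0)
        bank[s] = momentum * bank[s] + (1.0 - momentum) * mean_feat
    return bank

def binding_loss(bank, aug_feats, labels, eps=1e-8):
    """Mean (1 - cosine similarity) between each feature and its superclass prototype."""
    targets = bank[SUPERCLASS_OF[labels]]
    num = (aug_feats * targets).sum(axis=1)
    den = np.linalg.norm(aug_feats, axis=1) * np.linalg.norm(targets, axis=1) + eps
    return float((1.0 - num / den).mean())

# Toy usage: 6 points with 4-dimensional features.
rng = np.random.default_rng(0)
labels = np.array([0, 1, 2, 3, 4, 0])
clean = rng.normal(size=(6, 4))
# momentum=0.0 makes each prototype equal to the clean per-superclass mean.
bank = update_prototypes(np.zeros((3, 4)), clean, labels, momentum=0.0)
# Features that coincide with their prototypes incur (almost) no loss.
loss_self = binding_loss(bank, bank[SUPERCLASS_OF[labels]], labels)
```

In this toy setup, a feature pointing in the opposite direction of its prototype would incur the maximum loss of about 2, so the term penalizes drift toward dissimilar categories regardless of feature magnitude.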

Paper Structure

This paper contains 31 sections, 6 equations, 14 figures, 10 tables.

Figures (14)

  • Figure 1: Previous methods struggle to accurately predict the things category in adverse weather. Our approach overcomes this limitation, improving performance on the things category.
  • Figure 2: (a) Comparison of things classes in the SemanticKITTI (Behley et al., 2019; clean) and SemanticSTF (Xiao et al., 2023; corrupted) datasets. (b) Examples of things classes misclassified as stuff due to weather corruption. The prediction results were obtained with SJ+LPD (Park et al., 2024).
  • Figure 3: Confusion matrices for (a) superclasses and (b) things and stuff. They show a significant performance gap between things and stuff: things are often misclassified as stuff, whereas the reverse misclassification is rare.
  • Figure 4: (a) Framework of NTN. NTN builds upon SJ+LPD (Park et al., 2024) and improves the performance of the LiDAR semantic segmentation (LSS) model on the things category through Feature Binding (FB) and Beam-wise Feature Distillation (BFD). (b) FB stores prototypes obtained from clean data in a memory bank and encourages features derived from augmented data to align with these prototypes. (c) BFD constrains features from clean data and features from augmented data to match within a manually defined local region, referred to as a beam.
  • Figure 5: Illustration of (a) Point-wise Feature Distillation and (b) Beam-wise Feature Distillation.
  • ...and 9 more figures
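The contrast between point-wise and beam-wise feature distillation (Figures 4(c) and 5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formulation: mean pooling per beam, the squared-L2 distance, and the function names are hypothetical choices. The toy example shows why a beam-level term may retain a learning signal when weather corruption removes points.

```python
import numpy as np

def beam_means(feats, beam_ids, num_beams):
    """Average point features per beam; empty beams are flagged as invalid."""
    sums = np.zeros((num_beams, feats.shape[1]))
    np.add.at(sums, beam_ids, feats)  # unbuffered scatter-add per beam
    counts = np.bincount(beam_ids, minlength=num_beams)
    valid = counts > 0
    means = np.zeros_like(sums)
    means[valid] = sums[valid] / counts[valid][:, None]
    return means, valid

def pointwise_distill(clean_feats, corrupt_feats, keep_mask):
    """Squared-L2 alignment on surviving points only; dropped points contribute nothing."""
    diff = clean_feats[keep_mask] - corrupt_feats
    return float((diff ** 2).sum(axis=1).mean())

def beamwise_distill(clean_feats, corrupt_feats, beam_ids, keep_mask, num_beams):
    """Squared-L2 alignment of per-beam mean features (assumed pooling choice)."""
    clean_mean, clean_valid = beam_means(clean_feats, beam_ids, num_beams)
    corrupt_mean, corrupt_valid = beam_means(corrupt_feats, beam_ids[keep_mask], num_beams)
    both = clean_valid & corrupt_valid
    if not both.any():
        return 0.0
    diff = clean_mean[both] - corrupt_mean[both]
    return float((diff ** 2).sum(axis=1).mean())

# Toy scan: 4 points on 2 beams; the corruption drops the second point.
clean = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]])
beam_ids = np.array([0, 0, 1, 1])
keep = np.array([True, False, True, True])
corrupt = clean[keep]  # surviving points keep identical features in this toy case

pw = pointwise_distill(clean, corrupt, keep)            # 0.0: the dropped point is invisible
bw = beamwise_distill(clean, corrupt, beam_ids, keep, 2)  # 0.5: beam 0's mean shifted
```

Here the point-wise term is zero because it only compares surviving points, while the beam-wise term is nonzero because the dropped point changed the pooled descriptor of its beam.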