Can we Trust Unreliable Voxels? Exploring 3D Semantic Occupancy Prediction under Label Noise

Wenxin Li, Kunyu Peng, Di Wen, Junwei Zheng, Jiale Wei, Mengfei Duan, Yuheng Zhang, Rui Fan, Kailun Yang

TL;DR

This work establishes OccNL, the first benchmark dedicated to 3D occupancy under occupancy-asymmetric and dynamic trailing noise, and proposes DPR-Occ, a principled label noise-robust framework that constructs reliable supervision through dual-source partial label reasoning.

Abstract

3D semantic occupancy prediction is a cornerstone of robotic perception, yet real-world voxel annotations are inherently corrupted by structural artifacts and dynamic trailing effects. This raises a critical but underexplored question: can autonomous systems safely rely on such unreliable occupancy supervision? To systematically investigate this issue, we establish OccNL, the first benchmark dedicated to 3D occupancy under occupancy-asymmetric and dynamic trailing noise. Our analysis reveals a fundamental domain gap: state-of-the-art 2D label noise learning strategies collapse catastrophically in sparse 3D voxel spaces, exposing a critical vulnerability in existing paradigms. To address this challenge, we propose DPR-Occ, a principled label noise-robust framework that constructs reliable supervision through dual-source partial label reasoning. By synergizing temporal model memory with representation-level structural affinity, DPR-Occ dynamically expands and prunes candidate label sets to preserve true semantics while suppressing noise propagation. Extensive experiments on SemanticKITTI demonstrate that DPR-Occ prevents geometric and semantic collapse under extreme corruption. Notably, even at 90% label noise, our method achieves significant performance gains (up to 2.57% mIoU and 13.91% IoU) over existing label noise learning baselines adapted to the 3D occupancy prediction task. By bridging label noise learning and 3D perception, OccNL and DPR-Occ provide a reliable foundation for safety-critical robotic perception in dynamic environments. The benchmark and source code will be made publicly available at https://github.com/mylwx/OccNL.
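As a rough illustration of the dual-source idea described above, the sketch below builds a candidate label set for a single voxel by taking the Top-K classes from a teacher's predicted probabilities and the Top-K classes by cosine similarity between the voxel feature and per-class prototypes. The function names (`top_k`, `cosine`, `candidate_set`), the use of a plain set union as the fusion rule, and the toy inputs are all assumptions made for illustration; the abstract does not specify how the two sources are combined, so this should not be read as the authors' implementation.

```python
import math


def top_k(scores, k):
    """Return the indices of the k largest scores (ties go to the lower index)."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def candidate_set(teacher_probs, feature, prototypes, k):
    """Hypothetical fusion: union of Top-K teacher classes and Top-K prototype classes."""
    sims = [cosine(feature, proto) for proto in prototypes]
    return set(top_k(teacher_probs, k)) | set(top_k(sims, k))


# Toy voxel with three classes: the teacher favors class 0,
# while the feature lies closest to the prototype of class 2.
teacher_probs = [0.6, 0.3, 0.1]
feature = [1.0, 0.0]
prototypes = [[0.0, 1.0], [0.2, 1.0], [1.0, 0.1]]

narrow = candidate_set(teacher_probs, feature, prototypes, k=1)
wide = candidate_set(teacher_probs, feature, prototypes, k=2)
print(narrow, wide)
```

In this toy setup a larger `k` widens the candidate set and a smaller `k` narrows it, which is one simple way to picture expanding and pruning candidate labels over the course of training.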

Paper Structure (23 sections, 11 equations, 3 figures, 7 tables, 1 algorithm)

Figures (3)

  • Figure 1: Semantic distribution evolution under voxel-level category-flipping noise. Due to orders-of-magnitude differences in voxel counts, we employ a logarithmic scale for visualization (empty voxels omitted as they remain constant at $10^{-3}\eta$ flip rate). Increased noise drives the distribution toward uniformity: dominant classes (e.g., vegetation, road) are suppressed, while rare classes (e.g., motorcyclist, person) are artificially inflated, culminating in nearly identical voxel frequencies at $90\%$ noise.
  • Figure 2: The overall framework of our proposed DPR-Occ. Warm-up Stage: The model captures clean patterns via standard training on noisy labels $\tilde{Y}$. Robust Stage: Guided by dynamic-$K$ scheduling, we construct dual-source partial label sets by fusing Top-$K$ predictions from the EMA teacher and feature-prototype similarities. The network is then optimized using Partial Label Learning (PLL) and Negative Learning (NL), with EMA-guided Self-Not-True Distillation (SNTD) further suppressing noise for robust learning.
  • Figure 3: Qualitative results under $90\%$ asymmetric noise on the OccNL benchmark. Compared to collapsing baselines, DPR-Occ preserves structural integrity and reliable semantics. The final column shows a failure case (yellow box), where DPR-Occ still reconstructs basic road and vegetation geometry despite semantic misclassification.
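
The flattening effect described in the Figure 1 caption can be reproduced with a small calculation. The sketch below assumes a simple symmetric flipping model in which each voxel keeps its label with probability $1-\eta$ and otherwise flips uniformly to one of the other $C-1$ classes, so the expected count of class $c$ becomes $(1-\eta)\,n_c + \eta\,(N - n_c)/(C-1)$. This noise model, the function name `expected_counts`, and the class counts are illustrative assumptions, not the benchmark's actual noise protocol or SemanticKITTI statistics; empty voxels are left out entirely.

```python
def expected_counts(counts, eta):
    """Expected per-class voxel counts after symmetric flipping at rate eta.

    Assumes each voxel keeps its label with probability 1 - eta and otherwise
    moves to one of the other classes chosen uniformly at random.
    """
    total = sum(counts)
    num_classes = len(counts)
    return [
        (1 - eta) * n + eta * (total - n) / (num_classes - 1)
        for n in counts
    ]


def imbalance(counts):
    """Ratio between the most and least frequent class."""
    return max(counts) / min(counts)


# Hypothetical long-tailed distribution over 19 classes (not real dataset counts).
clean = [10**6, 10**5, 10**4] + [100] * 16

for eta in (0.0, 0.5, 0.9):
    noisy = expected_counts(clean, eta)
    print(f"eta={eta:.1f}  max/min ratio={imbalance(noisy):.2f}")
```

Under this assumed model the distribution becomes exactly uniform at $\eta = (C-1)/C$, which is about $0.947$ for 19 classes, so a $90\%$ flip rate already leaves the class frequencies within a factor of two of each other.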