Exploring Semi-Supervised Learning for Online Mapping

Adam Lilja, Erik Wallin, Junsheng Fu, Lars Hammarstrand

TL;DR

This work adapts the teacher–student semi-supervised learning (SSL) paradigm to online BEV-based mapping and introduces temporal fusion of the teacher's pseudo-labels across frames to exploit frame-to-frame consistency. Using a small labelled set and a large unlabelled one, the approach combines strong augmentations, confidence thresholding, and multi-frame teacher fusion to improve predictions of static map classes and to generalize to unseen cities. With 10% of the data labelled, it achieves a 3.5x improvement over training on the labelled data alone, and the gains hold across multiple online-mapping architectures. When adapting to Pittsburgh in Argoverse 2, adding purely unlabelled target-domain data reduces the performance gap from 5 to 0.5 mIoU. The study offers a practical SSL recipe for online mapping that reduces labelling requirements.

Abstract

The ability to generate online maps using only onboard sensory information is crucial for enabling autonomous driving beyond well-mapped areas. Training models for this task (predicting lane markers, road edges, and pedestrian crossings) traditionally requires extensive labelled data, which is expensive and labour-intensive to obtain. While semi-supervised learning (SSL) has shown promise in other domains, its potential for online mapping remains largely underexplored. In this work, we bridge this gap by demonstrating the effectiveness of SSL methods for online mapping. Furthermore, we introduce a simple yet effective method that leverages the inherent properties of online mapping by fusing the teacher's pseudo-labels from multiple samples, enhancing the reliability of self-supervised training. When 10% of the data is labelled, our method for leveraging unlabelled data achieves a 3.5x performance boost compared to using only the labelled data. This narrows the gap to a fully supervised model, which uses all labels, to just 3.5 mIoU. We also show strong generalization to unseen cities. Specifically, in Argoverse 2, when adapting to Pittsburgh, incorporating purely unlabelled target-domain data reduces the performance gap from 5 to 0.5 mIoU. These results highlight the potential of SSL as a powerful tool for solving the online mapping problem, significantly reducing reliance on labelled data.
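The fusion of teacher pseudo-labels from multiple samples can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-frame teacher probabilities for one binary map class have already been aligned to a common BEV grid, that fusion is a plain per-cell average over the frames that observe a cell, and that a hard confidence threshold `tau` selects which cells supervise the student. The names `fuse_teacher_probabilities`, `hard_pseudo_labels`, and `tau` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# One BEV grid of teacher probabilities for a single binary class.
# None marks a cell that the frame does not observe (assumption).
Grid = List[List[Optional[float]]]


def fuse_teacher_probabilities(frames: Sequence[Grid]) -> Grid:
    """Average per-cell probabilities over all frames that observe the cell.

    Assumes every frame has already been warped into the same BEV grid.
    Averaging is an illustrative choice, not necessarily the paper's rule.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    fused: Grid = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            observed = [f[r][c] for f in frames if f[r][c] is not None]
            if observed:
                fused[r][c] = sum(observed) / len(observed)
    return fused


def hard_pseudo_labels(
    fused: Grid, tau: float = 0.9
) -> Tuple[List[List[int]], List[List[int]]]:
    """Hard thresholding: keep only cells where the fused teacher is confident.

    Returns (labels, mask). The student loss would be evaluated only where
    mask == 1; unobserved or uncertain cells are ignored.
    """
    labels: List[List[int]] = []
    mask: List[List[int]] = []
    for row in fused:
        label_row, mask_row = [], []
        for p in row:
            confident = p is not None and max(p, 1.0 - p) >= tau
            label_row.append(1 if (p is not None and p >= 0.5) else 0)
            mask_row.append(1 if confident else 0)
        labels.append(label_row)
        mask.append(mask_row)
    return labels, mask


if __name__ == "__main__":
    # Two aligned frames; each sees only part of the 2x2 grid.
    frames = [
        [[1.0, 0.75], [None, 0.125]],
        [[0.75, 0.25], [0.0625, None]],
    ]
    fused = fuse_teacher_probabilities(frames)
    labels, mask = hard_pseudo_labels(fused, tau=0.8)
    print(fused, labels, mask)
```

In this toy example the two frames disagree on the top-right cell, so its fused probability falls below the threshold and the cell is masked out, while cells seen by only one frame still contribute if that frame is confident.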

Paper Structure

This paper contains 18 sections, 1 equation, 9 figures, 15 tables.

Figures (9)

  • Figure 1: Semi-supervised learning framework for online mapping with temporal pseudo-label fusion. Our approach adapts the Teacher-Student paradigm to the online mapping domain while introducing a novel temporal fusion mechanism that aggregates pseudo-labels across multiple frames. We enable label-efficient learning from largely unlabelled datasets and strong generalization to unseen cities.
  • Figure 2: No thresholding, soft (MixMatch [berthelot2019mixmatch]), and hard (FixMatch [sohn2020fixmatch]) thresholding utilising $2.5\%$ and $10\%$ of the labels. The reference result, reported as mean and std., uses neither thresholding nor sharpening.
  • Figure 3: Feature similarity using mean squared error (following [zhu2024semibevseg]) and cosine similarity (following [wallin2022doublematch]) for varying $\omega_{\text{feat}}$. The reference result uses no feature similarity, while early and late refer to using BEV features before or after the BEV processing $f^{\text{dec}}_\theta$ in \ref{fig:main-arch}.
  • Figure 4: Teacher fusion across varying maximum ranges from the current sample. Fusing the probabilities yields consistent improvements over training without any fusion.
  • Figure 5: Multiple SOTA online mapping methods benefit from the SSL scheme compared to supervised training.
  • ...and 4 more figures
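
The feature-similarity terms compared in Figure 3 can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: BEV features are flattened to plain vectors, the cosine term is written as one minus the cosine similarity, and the feature term is assumed to be added to the pseudo-label loss with weight `omega_feat` (standing in for $\omega_{\text{feat}}$). All function names are hypothetical.

```python
import math
from typing import Sequence


def mse_feature_loss(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Mean squared error between student and teacher feature vectors."""
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def cosine_feature_loss(
    student: Sequence[float], teacher: Sequence[float], eps: float = 1e-8
) -> float:
    """One minus the cosine similarity; 0 when the vectors point the same way."""
    dot = sum(s * t for s, t in zip(student, teacher))
    norm_s = math.sqrt(sum(s * s for s in student))
    norm_t = math.sqrt(sum(t * t for t in teacher))
    return 1.0 - dot / max(norm_s * norm_t, eps)


def unsupervised_loss(
    pseudo_label_loss: float, feature_loss: float, omega_feat: float
) -> float:
    """Assumed additive combination; omega_feat = 0 disables the feature term."""
    return pseudo_label_loss + omega_feat * feature_loss


if __name__ == "__main__":
    student, teacher = [3.0, 4.0], [4.0, 3.0]
    print(mse_feature_loss(student, teacher))
    print(cosine_feature_loss(student, teacher))
    print(unsupervised_loss(1.0, mse_feature_loss(student, teacher), omega_feat=0.5))
```

The "early" and "late" variants in the caption would differ only in which features are passed in: those taken before or after the BEV processing $f^{\text{dec}}_\theta$.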