The BRAVO Semantic Segmentation Challenge Results in UNCV2024

Tuan-Hung Vu, Eduardo Valle, Andrei Bursuc, Tommie Kerssies, Daan de Geus, Gijs Dubbelman, Long Qian, Bingke Zhu, Yingying Chen, Ming Tang, Jinqiao Wang, Tomáš Vojíř, Jan Šochman, Jiří Matas, Michael Smith, Frank Ferrie, Shamik Basu, Christos Sakaridis, Luc Van Gool

TL;DR

The unified BRAVO challenge benchmarks the reliability of semantic segmentation models under realistic perturbations and unknown out-of-distribution (OOD) scenarios. Its results point to the importance of large-scale pre-training and minimal architectural design in developing robust and reliable semantic segmentation models.

Abstract

We propose the unified BRAVO challenge to benchmark the reliability of semantic segmentation models under realistic perturbations and unknown out-of-distribution (OOD) scenarios. We define two categories of reliability: (1) semantic reliability, which reflects the model's accuracy and calibration when exposed to various perturbations; and (2) OOD reliability, which measures the model's ability to detect object classes that are unknown during training. The challenge attracted nearly 100 submissions from international teams representing notable research institutions. The results reveal interesting insights into the importance of large-scale pre-training and minimal architectural design in developing robust and reliable semantic segmentation models.
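
To make the two reliability categories concrete, the following is a minimal sketch and not the challenge's official evaluation code. It assumes a model that returns per-pixel class logits, uses the maximum softmax probability as the confidence score (a common baseline, chosen here only for illustration), and measures calibration with a simple binned expected calibration error. The function names, the threshold, and the number of bins are all hypothetical.

```python
import numpy as np

def softmax(logits, axis=0):
    # Numerically stable softmax along the class axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def predict_with_confidence(logits):
    """logits: array of shape (C, H, W).
    Returns per-pixel labels (H, W) and confidences (H, W) in [0, 1]."""
    probs = softmax(logits, axis=0)
    return probs.argmax(axis=0), probs.max(axis=0)

def flag_ood(confidence, threshold=0.5):
    # Assumption: low confidence is treated as evidence of an unknown class.
    return confidence < threshold

def expected_calibration_error(confidence, correct, n_bins=10):
    """Binned gap between mean confidence and accuracy (lower is better)."""
    confidence = np.asarray(confidence, dtype=float).ravel()
    correct = np.asarray(correct, dtype=float).ravel()
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidence > lo) & (confidence <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidence[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by fraction of pixels in bin
    return ece

if __name__ == "__main__":
    # Toy 3-class, 1x2 image: a confident pixel and an ambiguous one.
    logits = np.zeros((3, 1, 2))
    logits[0, 0, 0] = 10.0
    labels, conf = predict_with_confidence(logits)
    print(labels, conf, flag_ood(conf))
```

In this sketch, semantic reliability corresponds to the accuracy of `labels` together with the calibration error of `conf`, while OOD reliability corresponds to how well a score such as `1 - conf` separates known from unknown pixels.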

Paper Structure (22 sections, 6 figures, 10 tables)

Figures (6)

  • Figure 1: All submissions. Aggregated metrics (out-of-distribution and semantic) on the axes, ranking metric (BRAVO Index) shown as level sets. More freedom in the choice of training dataset (Task 2, in orange) did not translate into better results.
  • Figure 2: Analysis showing the correlation of the summary metric of each BRAVO subset.
  • Figure 3: DINOv2-OOD Meta-approach. We take a pre-trained Vision Foundation Model (VFM), attach a simple segmentation decoder, and fine-tune the entire model for semantic segmentation. The segmentation decoder outputs both the per-pixel classification predictions and the associated confidence scores.
  • Figure 4: PixOOD Variants. Simplified block representation of the PixOOD framework for the different submitted variants. From top to bottom: PixOOD, PixOOD w/ DeepLab Decoder, and PixOOD w/ ResNet101 DeepLab. Blue denotes blocks that are the same for all variants and are described in the PixOOD paper. Red denotes the differences between the methods in the semantic segmentation branches.
  • Figure 5: PhyFea approach. Top left: illustration of the complete network architecture, where the cross-entropy loss of the baseline network is added to the losses of PhyFea. Bottom: the pipeline of PhyFea, where red boxes are iterations of opening and yellow boxes are iterations of selective dilation. Top right: legend for the components of PhyFea, such as the operations applied iteratively for area opening and selective dilation, and the two functions used to calculate the losses.
  • ...and 1 more figure