Evaluating Bayesian Deep Learning Methods for Semantic Segmentation
Jishnu Mukhoti, Yarin Gal
TL;DR
The paper tackles the challenge of evaluating uncertainty in Bayesian deep learning for semantic segmentation. It proposes three specialized metrics and implements two Bayesian DeepLab-v3+ variants using MC dropout and Concrete dropout, evaluated on Cityscapes. Concrete dropout consistently outperforms MC dropout on the new metrics, and both Bayesian models exceed the deterministic baseline in uncertainty-aware performance. These results establish benchmarks for uncertainty quantification in safety-critical segmentation and motivate future work on downstream autonomous driving decisions.
Abstract
Deep learning has been revolutionary for computer vision, and for semantic segmentation in particular, with Bayesian Deep Learning (BDL) used to obtain uncertainty maps from deep models when predicting semantic classes. This information is critical when using semantic segmentation for autonomous driving, for example. Standard semantic segmentation systems have well-established evaluation metrics. However, with BDL's rising popularity in computer vision, we require new metrics to evaluate whether one BDL method produces better uncertainty estimates than another. In this work we propose three such metrics to evaluate BDL models designed specifically for the task of semantic segmentation. We modify DeepLab-v3+, one of the state-of-the-art deep neural networks, and create its Bayesian counterpart using MC dropout and Concrete dropout as inference techniques. We then compare and test these two inference techniques on the well-known Cityscapes dataset using our suggested metrics. Our results provide new benchmarks for researchers to compare and evaluate their improved uncertainty quantification in pursuit of safer semantic segmentation.
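To make the idea of an uncertainty map concrete, the following is a minimal sketch of how per-pixel uncertainty could be derived from several stochastic forward passes, as in MC dropout. It assumes predictive entropy of the averaged softmax output as the uncertainty measure, which is a common choice but is not stated in the abstract. The function names `predictive_entropy` and `uncertainty_map` and the nested-list data layout are hypothetical, chosen only for illustration, and the softmax samples are taken as given rather than produced by a real network.

```python
import math


def predictive_entropy(samples):
    """Return (mean probabilities, entropy) for one pixel.

    samples: list of T probability vectors, one per stochastic forward
    pass (for example, with dropout kept active at test time).
    """
    t = len(samples)
    num_classes = len(samples[0])
    # Average the class probabilities over the T passes.
    mean = [sum(s[c] for s in samples) / t for c in range(num_classes)]
    # Entropy of the averaged distribution; zero-probability terms contribute 0.
    entropy = -sum(p * math.log(p) for p in mean if p > 0.0)
    return mean, entropy


def uncertainty_map(sample_maps):
    """Return an H x W map of predictive entropies.

    sample_maps: list of T maps, each a nested list of shape H x W x C
    holding per-pixel class probabilities (assumed layout).
    """
    height = len(sample_maps[0])
    width = len(sample_maps[0][0])
    return [
        [predictive_entropy([m[i][j] for m in sample_maps])[1] for j in range(width)]
        for i in range(height)
    ]


# Two passes over a 1 x 2 image with two classes: the passes agree on the
# first pixel and disagree on the second.
pass_a = [[[1.0, 0.0], [1.0, 0.0]]]
pass_b = [[[1.0, 0.0], [0.0, 1.0]]]
entropy_map = uncertainty_map([pass_a, pass_b])
```

In this toy input, the pixel where the passes agree receives zero entropy, while the pixel where they disagree receives the maximum entropy for two classes, log 2. A real pipeline would operate on batched tensors in a deep learning framework rather than nested lists.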
