A Bayesian Approach to OOD Robustness in Image Classification

Prakhar Kaushik, Adam Kortylewski, Alan Yuille

TL;DR

The paper addresses the robustness of image classifiers to domain shift and occlusion using a Bayesian framework built on Compositional Neural Networks (CompNets). It introduces Unsupervised Generative Transition (UGT), which learns a transitional dictionary of von Mises-Fisher (vMF) kernels, roughly corresponding to object parts, that is intermediate between the source and target domains; the generative model is trained on this dictionary with source-domain annotations and then refined iteratively. Evaluations on OOD-CV, ImageNet-C, occlusion scenarios, and synthetic-to-real transfer show that UGT outperforms state-of-the-art alternatives, with improvements of up to 10% in top-1 accuracy on Occluded OOD-CV, all without target-domain annotations.

Abstract

An important and unsolved problem in computer vision is to ensure that algorithms are robust to changes in image domains. We address this problem in the scenario where we have access to images from the target domains but no annotations. Motivated by the challenges of the OOD-CV benchmark, where we encounter real-world Out-of-Domain (OOD) nuisances and occlusion, we introduce a novel Bayesian approach to OOD robustness for object classification. Our work extends Compositional Neural Networks (CompNets), which have been shown to be robust to occlusion but degrade badly when tested on OOD data. We exploit the fact that CompNets contain a generative head defined over feature vectors represented by von Mises-Fisher (vMF) kernels, which correspond roughly to object parts and can be learned without supervision. We observe that some vMF kernels are similar between different domains, while others are not. This enables us to learn a transitional dictionary of vMF kernels that are intermediate between the source and target domains, and to train the generative model on this dictionary using the annotations on the source domain, followed by iterative refinement. This approach, termed Unsupervised Generative Transition (UGT), performs very well in OOD scenarios even when occlusion is present. UGT is evaluated on different OOD benchmarks, including the OOD-CV dataset, several popular datasets (e.g., ImageNet-C [9]), artificial image corruptions (including added occluders), and synthetic-to-real domain transfer. It does well in all scenarios, outperforming SOTA alternatives (e.g., by up to 10% in top-1 accuracy on the Occluded OOD-CV dataset).
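
The observation that some vMF kernels are shared across domains while others are not can be illustrated with a small sketch. This is not the authors' implementation: the function names, the shared concentration `kappa`, and the similarity `threshold` are all assumptions made for illustration. It treats each kernel as a unit-norm mean direction, softly assigns features to kernels with an unnormalized vMF density, and flags source kernels that have a close counterpart in a target-domain dictionary.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Project vectors onto the unit sphere, as vMF models require."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def vmf_responsibilities(features, kernels, kappa=30.0):
    """Soft-assign features to vMF kernels.

    Assumes every kernel shares one concentration `kappa` (a simplification),
    so the normalizing constants cancel and p(k | f) is a softmax over
    kappa * mu_k^T f.
    """
    f = l2_normalize(np.asarray(features, dtype=float))
    mu = l2_normalize(np.asarray(kernels, dtype=float))
    logits = kappa * (f @ mu.T)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def match_kernels(source_kernels, target_kernels, threshold=0.9):
    """For each source kernel, find its most similar target kernel.

    Returns (is_shared, best_index, best_similarity), where similarity is the
    cosine between mean directions. `threshold` is a hypothetical cut-off.
    """
    s = l2_normalize(np.asarray(source_kernels, dtype=float))
    t = l2_normalize(np.asarray(target_kernels, dtype=float))
    sim = s @ t.T
    best_index = sim.argmax(axis=1)
    best_similarity = sim.max(axis=1)
    return best_similarity >= threshold, best_index, best_similarity


if __name__ == "__main__":
    source = np.eye(3)                       # three toy source kernels
    target = np.array([[1.0, 0.0, 0.0],      # identical to source kernel 0
                       [0.0, 0.6, 0.8]])     # between source kernels 1 and 2
    shared, idx, sim = match_kernels(source, target)
    print("shared:", shared, "best match:", idx, "similarity:", sim)
```

In this toy setting only the first source kernel has a close target counterpart; the other two are the kind of domain-specific kernels that a transitional dictionary would need to adapt.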

Paper Structure (18 sections, 2 equations, 2 figures, 1 table)


Figures (2)

  • Figure 1: Example of caption. It is set in Roman so that mathematics (always set in Roman: $B \sin A = A \sin B$) may be included without an ugly clash.
  • Figure 2: Example of a short caption, which should be centered.