Improving Out-of-Distribution Detection with Disentangled Foreground and Background Features

Choubo Ding, Guansong Pang

TL;DR

This work addresses the limitation of foreground-only OOD detection by introducing disentangled foreground and background features for robust OOD detection. The proposed DFB framework trains a dense $(K+1)$-class predictor with weakly supervised pseudo masks to learn ID background features, and then converts it into a $(K+1)$-class image classifier whose background OOD score complements existing foreground-based detectors. By fusing the foreground and background OOD signals with a configurable temperature, DFB consistently boosts state-of-the-art (SotA) methods on diverse OOD benchmarks, achieving new SotA performance while preserving or improving ID accuracy. The results advocate for holistic OOD detection that leverages both semantic foreground and non-semantic background information, with practical impact for safer deployment of vision systems in open-set settings.

Abstract

Detecting out-of-distribution (OOD) inputs is a principal task for ensuring the safety of deploying deep-neural-network classifiers in open-set scenarios. OOD samples can be drawn from arbitrary distributions and exhibit deviations from in-distribution (ID) data in various dimensions, such as foreground features (e.g., objects in CIFAR100 images vs. those in CIFAR10 images) and background features (e.g., textural images vs. objects in CIFAR10). Existing methods can confound foreground and background features in training, failing to utilize the background features for OOD detection. This paper considers the importance of feature disentanglement in OOD detection and proposes the simultaneous exploitation of both foreground and background features to support the detection of OOD inputs. To this end, we propose a novel framework that first disentangles foreground and background features from ID training samples via a dense prediction approach, and then learns a new classifier that can evaluate the OOD scores of test images from both foreground and background features. It is a generic framework that allows for a seamless combination with various existing OOD detection methods. Extensive experiments show that our approach 1) can substantially enhance the performance of four different state-of-the-art (SotA) OOD detection methods on multiple widely used OOD datasets with diverse background features, and 2) achieves new SotA performance on these benchmarks.
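
As a rough illustration of how a background signal could complement a foreground detector, the sketch below fuses two per-image scores computed from a $(K+1)$-way logit vector. The abstract does not give the exact scoring functions, so everything here is an assumption: the foreground score is the familiar maximum-softmax-probability baseline over the first $K$ logits, the background score is simply the negated $(K+1)$-th logit, and the fusion is a weighted sum controlled by `temperature`. All function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def foreground_ood_score(logits):
    """Assumed foreground score: negated max softmax probability over the K ID classes.

    `logits` has K+1 entries; the last one belongs to the background class.
    A higher return value means more OOD-like.
    """
    return -max(softmax(logits[:-1]))


def background_ood_score(logits):
    """Placeholder background score (an assumption, not the paper's definition):
    weak evidence for the ID background class is treated as OOD-like."""
    return -logits[-1]


def fused_ood_score(logits, temperature=1.0):
    """Assumed fusion: foreground score plus temperature-weighted background score."""
    return foreground_ood_score(logits) + temperature * background_ood_score(logits)


# Toy logits for K = 3 ID classes plus one background class.
id_like = [6.0, 1.0, 0.5, 3.0]    # confident class, familiar background
ood_like = [1.0, 0.9, 1.1, -2.0]  # flat class logits, unfamiliar background

for name, logits in [("id_like", id_like), ("ood_like", ood_like)]:
    print(name, round(fused_ood_score(logits, temperature=1.0), 3))
```

In this sketch, setting `temperature=0.0` falls back to the foreground-only detector, which is one way to picture how a background term could be attached to an existing method without changing it.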

Paper Structure

This paper contains 12 sections, 8 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Saliency maps of ID (CIFAR10 [krizhevsky2009learning]) and OOD datasets (CIFAR100 [krizhevsky2009learning], SVHN [netzer2011reading], Places365 [zhou2017places], Textures [cimpoi2014describing]) produced by a vanilla classifier and our proposed DFB classifier. Vanilla classifiers tend to focus on objects unrelated to the ID class, e.g., the person on a horse (where horse is the ID class), due to spurious correlation. In OOD data, vanilla classifiers struggle to localize objects within the image and treat the background features as foreground for ID classification. By disentangling foreground and background features, DFB effectively addresses these issues.
  • Figure 2: Overview of our proposed framework. It first uses a trained $K$-class classification network to obtain pseudo semantic segmentation masks and then learns the in-distribution features by training a $(K+1)$-class dense prediction network with the pseudo labels (Left). Lastly, it converts the dense prediction network to a $(K+1)$-class classifier in a lossless fashion, and leverages these $(K+1)$ prediction outputs for joint foreground and background OOD detection (Bottom Right).
  • Figure 3: Lossless conversion of a dense prediction network to a classification network.
  • Figure 4: Distribution of the foreground/background OOD scores of ID (CIFAR10/100) and OOD samples (Textures) in DFB.
  • Figure 5: t-SNE visualization of the features learned by the vanilla classification network and DFB, where the colored dots are ID samples of different classes and the black $\times$ markers are OOD samples.
  • ...and 2 more figures
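
Figure 3 refers to a lossless conversion from the dense prediction network to a classifier, but the caption does not spell out the mechanism. One standard way such a conversion can be exact, assumed here purely for illustration, relies on linearity: if the dense head is a 1x1 convolution whose per-pixel logits are averaged over the image, then applying the same weights as a linear layer to globally average-pooled features gives identical outputs. The sketch below checks this in plain Python; the function names and toy shapes are hypothetical.

```python
import random


def dense_then_pool(features, weight, bias):
    """Apply a 1x1-conv head at every pixel, then average the per-pixel logits.

    features: C x H x W nested lists; weight: (K+1) x C; bias: length K+1.
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    sums = [0.0] * len(weight)
    for h in range(height):
        for w in range(width):
            for k, row in enumerate(weight):
                sums[k] += bias[k] + sum(row[c] * features[c][h][w] for c in range(channels))
    return [s / (height * width) for s in sums]


def pool_then_linear(features, weight, bias):
    """Global-average-pool each channel, then reuse the same weights as a linear layer."""
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]
    return [bias[k] + sum(row[c] * pooled[c] for c in range(len(pooled)))
            for k, row in enumerate(weight)]


# Toy setup: 4 feature channels on a 3x3 map, K + 1 = 5 output classes.
random.seed(0)
C, H, W, K_PLUS_1 = 4, 3, 3, 5
features = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
weight = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(K_PLUS_1)]
bias = [random.uniform(-1, 1) for _ in range(K_PLUS_1)]

dense_logits = dense_then_pool(features, weight, bias)
image_logits = pool_then_linear(features, weight, bias)
max_gap = max(abs(a - b) for a, b in zip(dense_logits, image_logits))
print(f"max absolute difference: {max_gap:.2e}")
```

The two paths agree up to floating-point rounding only because no nonlinearity sits between the head and the pooling step; whether the paper's conversion has exactly this form is not stated in the captions above.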