
Extracting Usable Predictions from Quantized Networks through Uncertainty Quantification for OOD Detection

Rishi Singhal, Srinath Srinivasan

TL;DR

This work tackles out-of-distribution (OOD) detection in quantized vision models under resource constraints. It introduces an uncertainty quantification pipeline that applies inference-time Monte Carlo dropout to a fine-tuned backbone, followed by post-training int8 quantization. Under a Gaussian assumption, it computes a per-class confidence interval $(\mu - Z\sigma,\ \mu + Z\sigma)$ and uses a configurable `conf_factor` to decide which predictions to keep and which uncertain samples to discard. Filtering out confusing inputs yields usable predictions, improves F1 metrics on CIFAR-100 and CIFAR-100C, and comes with substantial model compression (~4x reduction in size) at the expense of increased inference time. The framework is relevant to safety-critical tasks, where reliable decision-making must be balanced against resource limitations.
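The interval-based filtering described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `confidence_intervals` and `decide` are hypothetical, `conf_factor` is assumed to play the role of $Z$, and the acceptance rule (keep the top class only if its lower bound clears every other class's upper bound) is one plausible reading of "discard uncertain samples" rather than a rule confirmed by the summary.

```python
from statistics import mean, pstdev


def confidence_intervals(mc_probs, conf_factor):
    """Per-class (mu - Z*sigma, mu + Z*sigma) over T stochastic forward passes.

    mc_probs: list of T softmax vectors, one per Monte Carlo dropout pass.
    conf_factor: assumed here to be the multiplier Z on the standard deviation.
    """
    num_classes = len(mc_probs[0])
    intervals = []
    for c in range(num_classes):
        samples = [probs[c] for probs in mc_probs]
        mu = mean(samples)
        sigma = pstdev(samples)  # population std over the T passes
        intervals.append((mu - conf_factor * sigma, mu + conf_factor * sigma))
    return intervals


def decide(mc_probs, conf_factor=1.96):
    """Return the predicted class index, or None if the sample is ignored.

    Assumed rule: accept the class with the highest mean only when its
    lower bound lies strictly above the upper bound of every other class.
    """
    intervals = confidence_intervals(mc_probs, conf_factor)
    # lower + upper equals 2 * mu, so this picks the class with the highest mean.
    top = max(range(len(intervals)), key=lambda c: sum(intervals[c]))
    top_lower = intervals[top][0]
    for c, (_, upper) in enumerate(intervals):
        if c != top and upper >= top_lower:
            return None  # intervals overlap: treat the sample as uncertain
    return top


# Three dropout passes that agree on class 0, and three that disagree.
confident = [[0.90, 0.05, 0.05], [0.88, 0.07, 0.05], [0.92, 0.04, 0.04]]
ambiguous = [[0.60, 0.40, 0.00], [0.40, 0.60, 0.00], [0.55, 0.45, 0.00]]
print(decide(confident), decide(ambiguous))
```

Under this assumed rule, a larger `conf_factor` widens every interval, so more samples are ignored and fewer, more reliable predictions are kept.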

Abstract

Out-of-distribution (OOD) detection has become more pertinent with advances in network design and increased task complexity. Identifying which parts of the data a given network misclassifies has become as valuable as the network's overall performance. Quantization can compress a model, at the cost of a minor loss in performance. This loss makes it all the more necessary to derive confidence estimates for the network's predictions. In line with this thinking, we introduce an Uncertainty Quantification (UQ) technique to quantify the uncertainty in the predictions of a pre-trained vision model. We then use this information to extract valuable predictions while ignoring the non-confident ones. We observe that our technique saves up to 80% of the ignored samples from being misclassified. The code is available here.
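To make the compression trade-off concrete, the sketch below quantizes a handful of float32 weights to int8 and compares storage size and rounding error. It assumes symmetric per-tensor quantization with a single scale, which is a common textbook scheme and not necessarily the one used in the paper; the helper names and the example weights are hypothetical.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: w is approximated by scale * q,
    with q an integer in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the int8 range.
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Map int8 levels back to approximate real-valued weights."""
    return [scale * v for v in q]


# Illustrative weights stored as 32-bit floats (typecode "f").
weights = array("f", [0.12, -0.5, 0.33, 0.9, -0.07, 0.0, 0.45, -0.81])
q, scale = quantize_int8(weights)

fp32_bytes = len(weights) * weights.itemsize
int8_bytes = len(q) * q.itemsize
ratio = fp32_bytes / int8_bytes
max_err = max(abs(w - r) for w, r in zip(weights, dequantize(q, scale)))

print(f"size ratio: {ratio:.1f}x, max abs error: {max_err:.5f}")
```

Each weight shrinks from four bytes to one, which is where a roughly 4x size reduction comes from, while the per-weight rounding error is bounded by half the scale. That small, systematic error is one source of the minor performance loss that motivates attaching a confidence estimate to each prediction.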

Paper Structure (16 sections, 3 equations, 8 figures, 14 tables)

Figures (8)

  • Figure 1: Representative Figure of the proposed UQ technique
  • Figure 2: Model architecture
  • Figure 3: Some images ignored by ResNet50: a) Correct: turtle, Predicted: shark; b) Correct: lawn_mower, Predicted: sweet_pepper; c) Correct: boy, Predicted: girl
  • Figure 4: Some images ignored by EfficientNetB0: a) Correct: snake, Predicted: worm; b) Correct: seal, Predicted: otter; c) Correct: shark, Predicted: dolphin
  • Figure 5: Some images ignored by ResNet50 at Severity 1: a) Defocus Blur - Correct: apple, Predicted: orange; b) Snow - Correct: mountain, Predicted: road; c) Elastic Transform - Correct: streetcar, Predicted: train; d) Gaussian Noise - Correct: cloud, Predicted: sea
  • ...and 3 more figures