Extracting Usable Predictions from Quantized Networks through Uncertainty Quantification for OOD Detection

Rishi Singhal; Srinath Srinivasan

TL;DR

This work tackles OOD detection in quantized vision models under resource constraints by introducing an uncertainty quantification pipeline that applies inference-time Monte Carlo dropout to a fine-tuned backbone followed by post-training int8 quantization. It computes per-class confidence intervals using a Gaussian assumption, $ (\mu - Z\sigma, \mu + Z\sigma) $, with a configurable conf_factor to decide predictions and discard uncertain samples. The approach yields usable predictions by filtering out confusing inputs, improves F1 metrics on CIFAR-100 and CIFAR-100C, and achieves substantial model compression (~4x reduction in size) at the expense of increased inference time. This framework is practically significant for safety-critical tasks where reliable decision-making must be balanced against resource limitations.
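
The interval test described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `decide`, the input layout (one softmax vector per stochastic forward pass), and the acceptance rule (keep a prediction only when the top class's lower bound exceeds every other class's upper bound) are assumptions made here for clarity. Only the interval $(\mu - Z\sigma, \mu + Z\sigma)$ and the `conf_factor` parameter come from the summary.

```python
from statistics import mean, pstdev


def decide(mc_probs, conf_factor=1.96):
    """Return a class index, or None if the sample should be discarded.

    mc_probs: list of T softmax vectors, one per Monte Carlo dropout pass
              (hypothetical input layout chosen for this sketch).
    conf_factor: the Z multiplier in (mu - Z*sigma, mu + Z*sigma).
    """
    num_classes = len(mc_probs[0])
    # Per-class mean and standard deviation across the T stochastic passes.
    mu = [mean(p[c] for p in mc_probs) for c in range(num_classes)]
    sigma = [pstdev([p[c] for p in mc_probs]) for c in range(num_classes)]
    # Gaussian confidence interval for every class.
    lower = [m - conf_factor * s for m, s in zip(mu, sigma)]
    upper = [m + conf_factor * s for m, s in zip(mu, sigma)]
    top = max(range(num_classes), key=lambda c: mu[c])
    # Assumed rule: accept only if the top interval clears all rival intervals.
    best_rival = max(upper[c] for c in range(num_classes) if c != top)
    return top if lower[top] > best_rival else None


# Three dropout passes that agree strongly on class 0.
confident = [[0.90, 0.05, 0.05], [0.88, 0.07, 0.05], [0.92, 0.04, 0.04]]
# Three dropout passes that flip between class 0 and class 1.
uncertain = [[0.60, 0.40, 0.00], [0.40, 0.60, 0.00], [0.55, 0.45, 0.00]]

print(decide(confident))
print(decide(uncertain))
```

Under this assumed rule, a larger `conf_factor` widens every interval, so more samples are discarded and the retained predictions are more conservative.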

Abstract

Out-of-distribution (OOD) detection has become more pertinent with advances in network design and increased task complexity. Identifying which parts of the data a given network misclassifies has become as valuable as the network's overall performance. Quantization compresses the model, but at the cost of a minor loss in performance. This loss makes it all the more necessary to derive a confidence estimate for the network's predictions. In line with this thinking, we introduce an Uncertainty Quantification (UQ) technique to quantify the uncertainty in the predictions from a pre-trained vision model. We subsequently leverage this information to extract valuable predictions while ignoring the non-confident predictions. We observe that our technique saves up to 80% of ignored samples from being misclassified. The code is available here.
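
To make the "ignored samples" figure concrete, the sketch below computes the share of discarded samples whose plain top-1 prediction would have been wrong. Reading the abstract's statistic this way is an assumption, and the function name and arguments are hypothetical; the numbers are toy values, not results from the paper.

```python
def misclassified_share_of_ignored(labels, top1_preds, kept_mask):
    """Fraction of ignored samples whose top-1 prediction was wrong.

    labels:     ground-truth class indices
    top1_preds: the model's top-1 predictions without any filtering
    kept_mask:  True where the uncertainty filter kept the sample
    """
    # Collect (label, prediction) pairs only for samples the filter dropped.
    ignored = [(y, p) for y, p, kept in zip(labels, top1_preds, kept_mask) if not kept]
    if not ignored:
        return 0.0
    wrong = sum(1 for y, p in ignored if y != p)
    return wrong / len(ignored)


# Toy data: three samples are ignored, two of which were mispredicted.
labels = [0, 1, 2, 3, 4]
top1_preds = [0, 1, 0, 0, 4]
kept_mask = [True, True, False, False, False]

share = misclassified_share_of_ignored(labels, top1_preds, kept_mask)
print(f"{share:.2f}")
```

A high share means the filter is mostly discarding inputs the model would have gotten wrong; a low share means it is also throwing away many correct predictions.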

Paper Structure (16 sections, 3 equations, 8 figures, 14 tables)

Introduction
Related Works
Methodology
Experiments & Results
Experimental Setup
CIFAR-100 Dataset
CIFAR-100C Dataset
Implementation Details
Evaluation Metrics
Experiments
CIFAR-100 Experiments
CIFAR-100C Experiments
Inference Time Comparisons
Conclusion
Future Work
...and 1 more section

Figures (8)

Figure 1: Representative Figure of the proposed UQ technique
Figure 2: Model architecture
Figure 3: Some images ignored by ResNet50 a) Correct: turtle, Predicted: shark; b) Correct: lawn_mower, Predicted: sweet_pepper; c) Correct: boy, Predicted: girl
Figure 4: Some images ignored by EfficientNetB0 a) Correct: snake, Predicted: worm; b) Correct: seal, Predicted: otter; c) Correct: shark, Predicted: dolphin
Figure 5: Some images ignored by ResNet50 for Severity 1 a) For Defocus Blur - Correct: apple, Predicted: orange; b) For Snow - Correct: mountain, Predicted: road; c) For Elastic Transform - Correct: streetcar, Predicted: train; d) For Gaussian Noise - Correct: cloud, Predicted: sea
...and 3 more figures
