Intelligent Multi-View Test Time Augmentation

Efe Ozturk; Mohit Prabhushankar; Ghassan AlRegib

TL;DR

This paper addresses the sensitivity of image classifiers to viewpoint variation by introducing an uncertainty-guided Test Time Augmentation (TTA) framework. The method has two stages. Stage-1 identifies a per-class optimal augmentation view using predictive uncertainty: a class-wise augmentation preference vector $\mathbf{v}$ is learned from a matrix $\mathbf{S}$ that tallies the best augmentation per class. Stage-2 gates the application of TTA with a threshold $\tau$; when TTA is invoked, the default-view prediction $P_{default}$ and the augmented-view prediction $P_{aug}$ are combined as $P_{final}=\frac{P_{default}+P_{aug}}{2}$ to produce the final prediction. Empirical results on the CURE-OR, FacePix, and Brain Tumor MRI datasets with ResNet50, VGG-16, and ViT models show a mean gain of $1.73\%$ over single-view baselines and substantial improvements over random augmentation, with notable gains such as $4.4\%$–$7.26\%$ on select configurations. The work demonstrates that uncertainty-guided, class-aware TTA can enhance robustness and efficiency in real-world image classification tasks and motivates further exploration of intelligent augmentation strategies.
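The Stage-1 tally described above can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes that the "best" augmentation for a sample is the view whose softmax output has the lowest Shannon entropy, and that $\mathbf{v}$ is the row-wise argmax of $\mathbf{S}$. The function names `entropy` and `select_views` and the nested-list input layout are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_views(
    view_probs: List[List[List[float]]],
    labels: List[int],
    num_classes: int,
) -> Tuple[List[List[int]], List[int]]:
    """Build the tally matrix S and the preference vector v.

    view_probs[i][a] is the softmax output for sample i under
    augmentation view a; labels[i] is the true class of sample i.
    Assumption: the best view per sample is the lowest-entropy one.
    """
    num_views = len(view_probs[0])
    # S[c][a] counts how often view a was the best view for class c.
    S = [[0] * num_views for _ in range(num_classes)]
    for per_view, label in zip(view_probs, labels):
        entropies = [entropy(p) for p in per_view]
        best = min(range(num_views), key=entropies.__getitem__)
        S[label][best] += 1
    # v[c] is the most frequently selected view for class c
    # (ties resolve to the lowest view index).
    v = [max(range(num_views), key=row.__getitem__) for row in S]
    return S, v


# Toy data: 3 samples, 2 augmentation views, 2 classes.
toy_probs = [
    [[0.5, 0.5], [0.9, 0.1]],  # class 0: view 1 is more confident
    [[0.6, 0.4], [0.8, 0.2]],  # class 0: view 1 is more confident
    [[0.1, 0.9], [0.5, 0.5]],  # class 1: view 0 is more confident
]
S, v = select_views(toy_probs, labels=[0, 0, 1], num_classes=2)
```

In this toy run, class 0 prefers view 1 and class 1 prefers view 0. A different uncertainty metric could be swapped in by replacing `entropy`.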

Abstract

In this study, we introduce an intelligent Test Time Augmentation (TTA) algorithm designed to enhance the robustness and accuracy of image classification models against viewpoint variations. Unlike traditional TTA methods that apply augmentations indiscriminately, our approach selects optimal augmentations based on predictive uncertainty metrics. This selection is achieved via a two-stage process: the first stage identifies the optimal augmentation for each class by evaluating uncertainty levels, while the second stage applies an uncertainty threshold to determine when TTA would be advantageous. As a result, augmentations contribute to classification more effectively than they would under uniform application across the dataset. Experiments across several datasets and neural network architectures validate our approach, yielding an average accuracy improvement of 1.73% over methods that use single-view images. This research underscores the potential of adaptive, uncertainty-aware TTA in improving the robustness of image classification in the presence of viewpoint variations, paving the way for further exploration into intelligent augmentation strategies.
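The second-stage gating can be illustrated with a short sketch. It is not the authors' code and rests on stated assumptions: uncertainty is measured as the Shannon entropy of the default-view prediction, TTA is invoked only when that entropy exceeds the threshold $\tau$, and the combination follows the TL;DR formula $P_{final}=\frac{P_{default}+P_{aug}}{2}$. The function names `entropy` and `stage2_predict` are hypothetical.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def stage2_predict(
    p_default: Sequence[float],
    p_aug: Sequence[float],
    tau: float,
) -> List[float]:
    """Return the final class probabilities for one test sample.

    Assumption: TTA is triggered when the default prediction is
    uncertain, i.e. when its entropy is strictly above tau.
    """
    if entropy(p_default) <= tau:
        # Confident default view: skip the augmentation entirely.
        return list(p_default)
    # Uncertain default view: average with the augmented-view prediction.
    return [(d + a) / 2.0 for d, a in zip(p_default, p_aug)]


# A confident sample keeps its default prediction...
confident = stage2_predict([0.98, 0.02], [0.60, 0.40], tau=0.5)
# ...while an uncertain one is averaged with the augmented view.
uncertain = stage2_predict([0.50, 0.50], [0.90, 0.10], tau=0.5)
```

Under these assumptions, the augmented view only needs to be evaluated for samples whose default prediction is uncertain, which is where the efficiency benefit over uniform TTA would come from.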

Paper Structure (7 sections, 2 equations, 4 figures, 2 tables)

Introduction
Literature Review
Methodology
Stage-1: Optimal Augmentation View Selection
Stage-2: Uncertainty Assessment
Experiments and Results
Conclusion

Figures (4)

Figure 1: Comparison of Intelligent Multi-View TTA with the conventional single-view method. This illustrates how the intelligent approach dynamically selects augmentation views to refine predictions (P), in contrast to the conventional method's reliance on a single, static view.
Figure 2: The proposed algorithm. (a) Stage-1: optimal augmentation view selection. (b) Stage-2: uncertainty assessment.
Figure 3: Example images from the datasets CURE-OR, FacePix and Brain Tumor MRI. Default view and augmentation views are indicated for each of the datasets.
Figure 4: Threshold sweep of entropy for VGG-16 trained on CURE-OR.
