Reduction of Class Activation Uncertainty with Background Information

H M Dipu Kabir

TL;DR

This work introduces a background class to reduce class activation uncertainty and improve generalization at a lower computational cost than multitask learning. By generating diverse, non-target background images and training with an extra background output, the method shifts the focus of the head layer toward robust, widely activated features, as analyzed through class activation mapping (CAM). Across varied datasets and architectures, including Vision Transformers, the approach yields improved or competitive accuracy with reduced training overhead and shows SOTA or near-SOTA results on several benchmarks. The study also discusses background-class generation principles, ablation results, and practical considerations, and frames future work on uncertainty scores and broader applications.

Abstract

Multitask learning is a popular approach to training high-performing neural networks with improved generalization. In this paper, we propose a background class that achieves improved generalization at a lower computational cost than multitask learning, to help researchers and organizations with limited computational power. We also present a methodology for selecting background images and discuss potential future improvements. We apply our approach to several datasets and achieve improved generalization with much lower computation. In the class activation mappings (CAMs) of the trained models, we observe a tendency to look at the bigger picture under the proposed training methodology. Applying the vision transformer with the proposed background class, we achieve state-of-the-art (SOTA) performance on the CIFAR-10C, Caltech-101, and CINIC-10 datasets. Example scripts are available in the 'CAM' folder of the following GitHub repository: github.com/dipuk0506/UQ
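
The idea of training with an extra background output can be illustrated with a minimal sketch of how labels might be assigned. This is not the author's script: the function name `add_background_class`, the sample identifiers, and the choice of index K for the background label are assumptions made for illustration only.

```python
import random


def add_background_class(target_samples, background_images, num_classes):
    """Merge target samples with background images.

    Target samples keep their labels 0..num_classes-1. Every background
    image receives the extra label `num_classes` (an assumed convention).
    """
    background_label = num_classes
    merged = list(target_samples)
    merged += [(image, background_label) for image in background_images]
    return merged, background_label


# Hypothetical placeholders standing in for real image tensors.
target = [("bird_01", 0), ("cat_07", 1), ("ship_03", 2)]
background = ["texture_11", "street_42"]

train_set, bg_label = add_background_class(target, background, num_classes=3)
random.Random(0).shuffle(train_set)  # mix background and target samples

# The classification head then needs K + 1 outputs instead of K.
num_outputs = bg_label + 1
print(num_outputs, len(train_set))
```

In this sketch the only change to a standard classification setup is the enlarged label space; the loss function and the training loop would stay as they are.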

Paper Structure (31 sections, 5 equations, 7 figures, 5 tables)

Figures (7)

  • Figure 1: Models trained traditionally and with the background class are applied to a bird image (a) from the STL-10 dataset. Subplots (b) and (c) show the class activation mapping of the bird class on the final convolutional layer for traditional training and for training with the background class, respectively. Subplots (d) and (e) show the class activation mapping together with the image. Subplots (f) and (g) show deep feature factorization results on the image for the traditional and the proposed method, respectively.
  • Figure 2: Rough diagrams explaining the potential generalization improvement in multitask learning. The decision boundary in traditional learning (a) becomes overfitted. The decision boundary in multitask learning (b) can potentially be more generalized than the traditional one. However, multitask learning can also introduce underfitting.
  • Figure 3: Rough diagrams presenting multitask learning, transfer learning, and the proposed training with one or more background classes. In multitask learning (a), the child model is created from a parent model. Layers can be frozen, and non-frozen layers can be initialized either from transferred weights or randomly. New layers can be added or removed. In traditional transfer learning (b), all initial layers are usually kept frozen. In the proposed work (c), we start with a transfer-learned model. Initial parameter values are copied from the pre-trained model. Layers are not frozen during training. The head contains outputs for the classes of the data and for the background class.
  • Figure 4: Visualization of the Ablation Study. (a) Data Ablation: One input parameter is removed to observe the effect. (b) Model Ablation: A portion of the model is removed to observe the effect.
  • Figure 5: Shift of decision boundaries due to a change in domains. A poorly trained neural network (NN) can perform well in the training domain yet perform poorly in test domains.
  • ...and 2 more figures
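
The caption of Figure 1 refers to class activation mapping on the final convolutional layer. As a rough illustration, the sketch below uses the standard CAM formulation, in which the map for a class is the weighted sum of the final feature maps using that class's head weights. Whether the paper's scripts compute CAM in exactly this way is an assumption; the toy feature maps and the weights `bird_weights` are made up for the example.

```python
def class_activation_map(feature_maps, class_weights):
    """Standard CAM: cam[y][x] = sum_k class_weights[k] * feature_maps[k][y][x]."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(class_weights, feature_maps):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def normalize(cam):
    """Rescale a map to [0, 1] so it can be drawn as a heat map."""
    lowest = min(min(row) for row in cam)
    highest = max(max(row) for row in cam)
    span = highest - lowest
    if span == 0:
        return [[0.0] * len(row) for row in cam]
    return [[(value - lowest) / span for value in row] for row in cam]


# Toy example: two 2x2 feature maps from a hypothetical final conv layer.
features = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 2.0], [0.0, 1.0]],
]
bird_weights = [0.5, 1.0]  # hypothetical head weights for the "bird" output

cam = class_activation_map(features, bird_weights)
heat = normalize(cam)
print(cam)   # [[0.5, 2.0], [0.0, 1.0]]
print(heat)  # [[0.25, 1.0], [0.0, 0.5]]
```

With a background output in the head, the same computation could be run with the background class's weights to see which regions the model attributes to background.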