Table of Contents
Fetching ...

Supervised Multilabel Image Classification Using Residual Networks with Probabilistic Reasoning

Lokender Singh, Saksham Kumar, Chandan Kumar

TL;DR

This work tackles multilabel image classification by integrating a probabilistic reasoning layer with a ResNet-101 backbone to model label co-occurrence and uncertainty. The method outputs per-label probabilities via $P(y_i|x) = σ(f(x))$ rather than independent binary decisions, enabling more accurate joint label predictions. On COCO-2014, it achieves an mAP of $0.794$, outperforming ResNet-101 SRN ($0.771$) and Vision Transformer baselines ($0.785$), with favorable precision, recall, and F1 scores. This probabilistic integration enhances interpretability of label dependencies while maintaining computational efficiency, offering a practical boost for complex scene understanding and downstream tasks.

Abstract

Multilabel image categorization has drawn interest recently because of its numerous computer vision applications. The proposed work introduces a novel method for classifying multilabel images using the COCO-2014 dataset and a modified ResNet-101 architecture. By simulating label dependencies and uncertainties, the approach uses probabilistic reasoning to improve prediction accuracy. Extensive tests show that the model outperforms earlier techniques and approaches to state-of-the-art outcomes in multilabel categorization. The work also thoroughly assesses the model's performance using metrics like precision-recall score and achieves 0.794 mAP on COCO-2014, outperforming ResNet-SRN (0.771) and Vision Transformer baselines (0.785). The novelty of the work lies in integrating probabilistic reasoning into deep learning models to effectively address the challenges presented by multilabel scenarios.

Supervised Multilabel Image Classification Using Residual Networks with Probabilistic Reasoning

TL;DR

This work tackles multilabel image classification by integrating a probabilistic reasoning layer with a ResNet-101 backbone to model label co-occurrence and uncertainty. The method outputs per-label probabilities via rather than independent binary decisions, enabling more accurate joint label predictions. On COCO-2014, it achieves an mAP of , outperforming ResNet-101 SRN () and Vision Transformer baselines (), with favorable precision, recall, and F1 scores. This probabilistic integration enhances interpretability of label dependencies while maintaining computational efficiency, offering a practical boost for complex scene understanding and downstream tasks.

Abstract

Multilabel image categorization has drawn interest recently because of its numerous computer vision applications. The proposed work introduces a novel method for classifying multilabel images using the COCO-2014 dataset and a modified ResNet-101 architecture. By simulating label dependencies and uncertainties, the approach uses probabilistic reasoning to improve prediction accuracy. Extensive tests show that the model outperforms earlier techniques and approaches to state-of-the-art outcomes in multilabel categorization. The work also thoroughly assesses the model's performance using metrics like precision-recall score and achieves 0.794 mAP on COCO-2014, outperforming ResNet-SRN (0.771) and Vision Transformer baselines (0.785). The novelty of the work lies in integrating probabilistic reasoning into deep learning models to effectively address the challenges presented by multilabel scenarios.

Paper Structure

This paper contains 16 sections, 7 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Multilabel Image Classification Model Architecture
  • Figure 2: Food Items in a Lunchbox
  • Figure 3: Giraffes in the wild