Fine-Grained ImageNet Classification in the Wild

Maria Lymperaiou; Konstantinos Thomas; Giorgos Stamou

TL;DR

This work examines the robustness of image classifiers under real-world distribution shifts by using uncurated web images for fine-grained ImageNet classification guided by WordNet hierarchies. It introduces a three-stage method: build a WordNet-guided, balanced web-image dataset; classify the images with pre-trained CNNs and Transformers, without fine-tuning; and evaluate the results with knowledge-driven metrics that quantify the semantic similarity of misclassifications using $path(c_1, c_2)$, $LCH$, and $WUPS$. The study finds that accuracy alone fails to capture misclassification quality: knowledge-driven metrics reveal whether errors are semantically related or distant, and show that Transformers often align more closely with semantic relations than CNNs. The paper provides an explainable evaluation framework and a reproducible pipeline for assessing fine-grained classification under real-world conditions, with practical implications for robust deployment and future research.

Abstract

Image classification has been one of the most popular tasks in Deep Learning, seeing an abundance of impressive implementations each year. However, there is a lot of criticism tied to promoting complex architectures that continuously push performance metrics higher and higher. Robustness tests can uncover several vulnerabilities and biases which go unnoticed during the typical model evaluation stage. So far, model robustness under distribution shifts has mainly been examined within carefully curated datasets. Nevertheless, such approaches do not test the real response of classifiers in the wild, e.g. when uncurated web-crawled image data of corresponding classes are provided. In our work, we perform fine-grained classification on closely related categories, which are identified with the help of hierarchical knowledge. Extensive experimentation on a variety of convolutional and transformer-based architectures reveals model robustness in this novel setting. Finally, hierarchical knowledge is again employed to evaluate and explain misclassifications, providing an information-rich evaluation scheme adaptable to any classifier.
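To make the knowledge-driven evaluation more concrete, the sketch below computes the three kinds of hierarchy-based scores named in the TL;DR for a (true class, predicted class) pair. It is a minimal illustration, not the authors' implementation. The hierarchy `PARENT` and its class names are hypothetical, and the formulas are the commonly used WordNet-style definitions, assumed here: path length as the number of edges through the lowest common subsumer, Leacock-Chodorow as $-\log\frac{path + 1}{2D}$ with $D$ the maximum depth, and Wu-Palmer as $\frac{2 \cdot depth(lcs)}{depth(c_1) + depth(c_2)}$. The paper's exact normalization may differ.

```python
import math

# Hypothetical toy hierarchy (child -> parent); not taken from the paper or WordNet.
PARENT = {
    "animal": None,
    "dog": "animal",
    "cat": "animal",
    "terrier": "dog",
    "hound": "dog",
    "tabby": "cat",
}


def ancestors(node):
    """Return the chain [node, parent, ..., root]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(node))


def lowest_common_subsumer(c1, c2):
    """Deepest node that is an ancestor of both classes."""
    seen = set(ancestors(c1))
    for node in ancestors(c2):  # walks upward from c2, so the first hit is the lowest
        if node in seen:
            return node
    raise ValueError("no common ancestor")


def path_length(c1, c2):
    """Number of edges on the shortest path, which passes through the LCS in a tree."""
    lcs = lowest_common_subsumer(c1, c2)
    return (depth(c1) - depth(lcs)) + (depth(c2) - depth(lcs))


def lch(c1, c2):
    """Leacock-Chodorow-style similarity (assumed form): higher means closer."""
    max_depth = max(depth(n) for n in PARENT)
    return -math.log((path_length(c1, c2) + 1) / (2.0 * max_depth))


def wup(c1, c2):
    """Wu-Palmer-style similarity (assumed form), in (0, 1]."""
    lcs = lowest_common_subsumer(c1, c2)
    return 2.0 * depth(lcs) / (depth(c1) + depth(c2))


if __name__ == "__main__":
    # A sibling confusion versus a confusion across distant branches.
    for true_cls, pred_cls in [("terrier", "hound"), ("terrier", "tabby")]:
        print(true_cls, pred_cls,
              path_length(true_cls, pred_cls),
              round(lch(true_cls, pred_cls), 3),
              round(wup(true_cls, pred_cls), 3))
```

In this toy setting, confusing two classes that share a direct parent yields a shorter path and higher similarity scores than confusing classes from different branches, which is the distinction that plain accuracy cannot express.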

Paper Structure (15 sections, 2 equations, 1 figure, 8 tables)

Introduction
Related work
Image classifiers
Robustness under distribution shifts
Method
Dataset creation
Classification
Explanations
Experiments
Convolutional classifiers
Transformer-based classifiers
Explaining misconceptions
Knowledge-driven metrics
Conclusion
More CNN misclassifications

Figures (1)

Figure 1: Outline of our method.
