How Quality Affects Deep Neural Networks in Fine-Grained Image Classification

Joseph Smith, Zheming Zuo, Jonathan Stonehouse, Boguslaw Obara

TL;DR

This work tackles the sensitivity of fine-grained image classification to low-quality imagery by introducing a No-Reference Image Quality Assessment (NRIQA) guided cut-off point selection framework. It combines correlation-driven selection of robust NRIQA methods (notably LAPM, WAVS, and MUSIQ) with kernel-density-estimation-based cut-offs and a majority voting scheme to curate high-quality image subsets for training. Empirical results on a commercial product dataset show mean accuracy gains of 0.7% to 4.2% across four deep neural classifiers when training on high-quality subsets. An ablation study with ResNet34 further shows that a mixed-quality training set, with roughly 30% of the lowest-quality images removed, loses only 1.3% in classification precision while better reflecting real-world testing conditions. The approach offers a practical, scalable path to improve fine-grained classifiers in uncontrolled environments, with potential applicability to other domain tasks requiring quality-aware data selection.

Abstract

In this paper, we propose a No-Reference Image Quality Assessment (NRIQA) guided cut-off point selection (CPS) strategy to enhance the performance of a fine-grained classification system. Scores given by existing NRIQA methods on the same image may vary, and may not be as independent of natural image augmentations as expected, which weakens both their connection to fine-grained image classification and their power to explain it. Taking the three most commonly adopted image augmentation configurations (cropping, rotating, and blurring) as the entry point, we formulate a two-step mechanism for selecting the most discriminative subset from a given image dataset by considering both the confidence of model predictions and the density distribution of image qualities over several NRIQA methods. Concretely, the cut-off points yielded by those methods are aggregated via majority voting to inform the process of image subset selection. The efficacy and efficiency of this mechanism are confirmed by comparing models trained on high-quality images against models trained on a combination of high- and low-quality ones, with an improvement of 0.7% to 4.2% in mean accuracy on a commercial product dataset across four deep neural classifiers. The robustness of the mechanism is shown by the observation, from an ablation study using ResNet34, that all the selected high-quality images can be combined with 70% of the low-quality images at a cost of 1.3% in classification precision.
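
The two-step idea in the abstract (one cut-off point per NRIQA method, then majority voting across methods) can be sketched as follows. This is a minimal illustration and not the authors' implementation. The cut-off rule used here is an assumption: the cut-off is the first score, scanning from the mean score of incorrectly predicted images up to the mean score of correctly predicted ones, at which the kernel density of the correct group reaches that of the incorrect group. The function names, the rule-of-thumb bandwidth, and the convention that a higher score means better quality are also assumptions made for the example.

```python
import math
from statistics import mean, stdev


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian KDE built from samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth: 1.06 * sigma * n^(-1/5).

    Needs at least two distinct values, otherwise the bandwidth is zero.
    """
    return 1.06 * stdev(samples) * len(samples) ** -0.2


def kde_cutoff(scores, correct, grid_size=200):
    """Assumed cut-off rule (not taken from the paper).

    scores  : quality scores from one NRIQA method, higher = better (assumed)
    correct : booleans, True where the classifier predicted the image correctly
    """
    good = [s for s, c in zip(scores, correct) if c]
    bad = [s for s, c in zip(scores, correct) if not c]
    start, stop = mean(bad), mean(good)
    if start >= stop:
        raise ValueError("expected correct predictions to score higher on average")
    f_good = gaussian_kde(good, rule_of_thumb_bandwidth(good))
    f_bad = gaussian_kde(bad, rule_of_thumb_bandwidth(bad))
    # Walk from the "bad" mean towards the "good" mean and stop at the
    # first grid point where the "good" density is at least the "bad" one.
    for i in range(grid_size + 1):
        x = start + (stop - start) * i / grid_size
        if f_good(x) >= f_bad(x):
            return x
    return stop


def majority_vote(scores_by_method, cutoffs):
    """Indices of images that a strict majority of methods rate as high quality."""
    methods = list(scores_by_method)
    n_images = len(scores_by_method[methods[0]])
    selected = []
    for i in range(n_images):
        votes = sum(scores_by_method[m][i] >= cutoffs[m] for m in methods)
        if 2 * votes > len(methods):
            selected.append(i)
    return selected
```

In use, `kde_cutoff` would be called once per NRIQA method (for example LAPM, WAVS, and MUSIQ) to build the `cutoffs` dictionary, and `majority_vote` would then return the indices of the images kept for training. With an odd number of methods, as here, a strict majority cannot tie.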

Paper Structure

This paper contains 21 sections, 2 equations, 4 figures, and 3 tables.

Figures (4)

  • Figure 1: A raw sample from Dataset 1, showing the underside of a commercial product (a bottle with printed codes on its bottom), which is progressively processed through three typical image augmentation configurations: (a) cropping, (b) rotating, and (c) blurring.
  • Figure 2: Overall framework of the proposed two-step mechanism for selecting a high-quality image subset from an image dataset to improve fine-grained image classification. The upper row depicts the search for the most appropriate subset of NRIQA methods, specified in Section \ref{sec:iqa_methods}, and the majority voting procedure for selecting high-quality images from a given dataset, specified in Section \ref{sec:cpsmeth}. The bottom left box shows how training sets made up of only high-quality images and of mixed-quality images are used to measure how the accuracy of a model is affected by the quality of its training set (Section \ref{sec:exp1meth}). The bottom right box shows how a varying amount of low-quality images is added to the training set to find the optimal mix of low- and high-quality images (Section \ref{sec:exp2meth}).
  • Figure 3: The scatter plots reveal the relationship between the image-wise quality score in the $x$-axis and the prediction confidence of the pretrained classification model in the $y$-axis for images in Dataset 2, as well as the corresponding density distribution of correct/incorrect predictions.
  • Figure 4: Explanations of the model predictions on low- and high-quality testing images.
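
The relationship plotted in Figure 3, quality score against prediction confidence, can be summarized as a single number with a rank correlation. The sketch below computes Spearman's coefficient in plain Python. Spearman is chosen here for illustration only; the source does not say which correlation measure drives the selection of NRIQA methods, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(quality_scores, confidences):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(quality_scores), average_ranks(confidences))
```

A coefficient near 1 would mean that images an NRIQA method rates higher also tend to receive more confident predictions, which is the kind of agreement one would look for before trusting that method's cut-off.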