PCNN: Probable-Class Nearest-Neighbor Explanations Improve Fine-Grained Image Classification Accuracy for AIs and Humans

Giang Nguyen, Valerie Chen, Mohammad Reza Taesiri, Anh Totti Nguyen

TL;DR

PCNN tackles misleading top-1 explanations in fine-grained image classification by using Probable-Class Nearest Neighbors drawn from the top-$K$ predictions of a frozen classifier C. It trains an image comparator S on PCNN pairs and combines C and S in a Product of Experts to re-rank predictions, yielding consistent gains across CUB-200, Cars-196, and Dogs-120. The approach generalizes to unseen classifiers, and a human study shows that PCNN explanations reduce over-reliance on AI and increase decision accuracy. Although PCNN introduces runtime overhead, thresholding and data-size reductions mitigate the cost, making it a practical tool for improving AI decisions and human-AI collaboration in fine-grained tasks.

Abstract

Nearest neighbors (NN) are traditionally used to compute final decisions, e.g., in Support Vector Machines or k-NN classifiers, and to provide users with explanations for the model's decision. In this paper, we show a novel utility of nearest neighbors: to improve predictions of a frozen, pretrained image classifier C. We leverage an image comparator S that (1) compares the input image with NN images from the top-K most probable classes given by C; and (2) uses scores from S to weight the confidence scores of C to refine predictions. Our method consistently improves fine-grained image classification accuracy on CUB-200, Cars-196, and Dogs-120. Also, a human study finds that showing users our probable-class nearest neighbors (PCNN) reduces over-reliance on AI, thus improving their decision accuracy over prior work, which shows only the most-probable (top-1) class examples.
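
The re-ranking step described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: images are stood in for by single floats, `gallery` is a hypothetical per-class pool of training features, and `toy_comparator` is a hand-written similarity in (0, 1] that replaces the trained comparator S. Only the overall shape (take the top-K classes of C, fetch one nearest neighbor per class, multiply C's confidence by the comparator score, re-sort) follows the text.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def rerank(
    x: float,
    class_probs: Sequence[float],
    gallery: Dict[int, List[float]],
    comparator: Callable[[float, float], float],
    k: int,
) -> List[Tuple[int, float]]:
    """Re-rank the top-k classes of C by C(x)[c] * S(x, nn_c)."""
    # Top-k most probable classes according to the frozen classifier C.
    top_k = sorted(range(len(class_probs)),
                   key=lambda c: class_probs[c], reverse=True)[:k]
    rescored = []
    for c in top_k:
        # Nearest neighbor of x within class c (toy 1-D distance).
        nn = min(gallery[c], key=lambda g: abs(g - x))
        # Weight C's confidence by the comparator score for (x, nn).
        rescored.append((c, class_probs[c] * comparator(x, nn)))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def toy_comparator(a: float, b: float) -> float:
    """Hypothetical stand-in for S: close to 1 when the inputs are close."""
    return 1.0 / (1.0 + abs(a - b))


probs = [0.5, 0.4, 0.1]                      # C(x): class 0 is the top-1 label
gallery = {0: [0.0, 0.2], 1: [0.9, 1.1], 2: [5.0]}
ranked = rerank(1.0, probs, gallery, toy_comparator, k=3)
print([c for c, _ in ranked])                # [1, 0, 2]
```

In this toy run the classifier's top-1 label is class 0, but the query lies much closer to the class-1 neighbor, so the product promotes class 1 to the top.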

Paper Structure (55 sections, 4 equations, 28 figures, 17 tables, 2 algorithms)

Figures (28)

  • Figure 1: Given an input image $x$, a black-box, pretrained classifier $\mathbf{C}$ predicts the label for $x$. Prior work (a) typically shows only the nearest neighbors from the top-1 predicted class as explanations for the decision, which often fools humans into accepting wrong decisions (here, Caspian Tern) because the input resembles the top-1 class examples. Instead, including extra nearest neighbors (b) from the top-2 to top-$K$ classes improves not only human accuracy on this binary distinction task but also AI accuracy on standard fine-grained image classification tasks (see Figure 2).
  • Figure 2: $\mathbf{C} \times \mathbf{S}$ re-ranking algorithm: From each of the top-$K$ classes predicted by $\mathbf{C}$, we find the nearest neighbor $nn$ to the query $x$ and compute a sigmoid similarity score $\mathbf{S}(x, nn)$, which weights the original $\mathbf{C}(x)$ probabilities, re-ranking the labels. See the paper's algorithm listing (alg:poe_advising_process) for the written algorithm.
  • Figure 3: Our comparator takes in a pair of images $(x, nn)$ and outputs a sigmoid score $\mathbf{s} = \mathbf{S}(x, nn) \in [0, 1]$ indicating whether the two images belong to the same class. $L$, $M$, and $N$ are the depths of the respective blocks.
  • Figure 4: For each training-set image $x$, we sample the $Q$ nearest images from the groundtruth class of $x$ to form $Q$ positive pairs $\{ (x, nn_{+}^{i}) \}_{i=1}^{Q}$. To sample $Q$ hard negative pairs, we take, for each non-groundtruth class among the top-$Q$ classes predicted by $\mathbf{C}(x)$, the image nearest to the input. Here, when the groundtruth label (Elegant Tern) is among the top-$Q$ labels, there are only $Q - 1$ negative pairs.
  • Figure 5: The $\mathbf{C} \times \mathbf{S}$ model successfully corrects originally wrong predictions made by ResNet-50.
  • ...and 23 more figures
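
The pair-sampling scheme in the Figure 4 caption can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: images are represented by single floats, `gallery` is a hypothetical per-class pool of training features, the query is excluded from its own positive candidates (an assumption the caption does not state), and the function name `sample_pairs` is invented here.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[float, float, int]  # (query, neighbor, label: 1 = same class)


def sample_pairs(
    x: float,
    groundtruth: int,
    class_probs: Sequence[float],
    gallery: Dict[int, List[float]],
    q: int,
) -> List[Pair]:
    """Build positive and hard-negative training pairs for one image x."""
    def dist(g: float) -> float:
        return abs(g - x)  # toy 1-D distance

    # Positives: the q nearest images from the groundtruth class.
    # Assumption: x itself is not reused as its own neighbor.
    candidates = [g for g in gallery[groundtruth] if g != x]
    positives = sorted(candidates, key=dist)[:q]

    # Hard negatives: one nearest image per non-groundtruth class
    # among the top-q classes predicted by C(x).
    top_q = sorted(range(len(class_probs)),
                   key=lambda c: class_probs[c], reverse=True)[:q]
    negatives = [min(gallery[c], key=dist) for c in top_q if c != groundtruth]

    return [(x, p, 1) for p in positives] + [(x, n, 0) for n in negatives]


probs = [0.5, 0.4, 0.1]                       # C(x); groundtruth is class 1
gallery = {0: [0.0, 0.2], 1: [0.9, 1.1, 1.5], 2: [5.0]}
pairs = sample_pairs(1.0, 1, probs, gallery, q=2)
print(pairs)
```

With `q=2` the groundtruth class is among the top-2 predictions, so the sketch yields two positive pairs and only one negative pair, matching the $Q - 1$ case noted in the caption.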