Classes Are Not Equal: An Empirical Study on Image Recognition Fairness

Jiequan Cui, Beier Zhu, Xin Wen, Xiaojuan Qi, Bei Yu, Hanwang Zhang

TL;DR

The study concludes that data augmentation and representation learning algorithms improve overall image classification performance by promoting fairness to some degree.

Abstract

In this paper, we present an empirical study on image recognition fairness, i.e., extreme class accuracy disparity on balanced data like ImageNet. We experimentally demonstrate that classes are not equal and that the fairness issue is prevalent for image classification models across various datasets, network architectures, and model capacities. Moreover, we identify several intriguing properties of fairness. First, the unfairness lies in problematic representation rather than classifier bias. Second, with the proposed concept of Model Prediction Bias, we investigate the origins of problematic representation during optimization. Our findings reveal that models tend to exhibit greater prediction biases for classes that are more challenging to recognize; that is, samples from more of the other classes are confused with the harder classes. The resulting False Positives (FPs) then dominate the learning of these harder classes during optimization, leading to their poor accuracy. Further, we conclude that data augmentation and representation learning algorithms improve overall performance by promoting fairness to some degree in image classification. The code is available at https://github.com/dvlab-research/Parametric-Contrastive-Learning.
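
The two quantities the abstract relies on, per-class accuracy and per-class false positives, can be illustrated with a minimal sketch. This is not the authors' code and does not reproduce the paper's definition of Model Prediction Bias, which is not given in this excerpt. The function name `per_class_stats` and the toy labels are hypothetical; the sketch only assumes that a false positive for class c is a sample of another class predicted as c.

```python
from collections import Counter


def per_class_stats(y_true, y_pred, num_classes):
    """Return (accuracy, false_positives), each a list indexed by class id.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    support = Counter(y_true)                            # samples per true class
    correct = Counter(t for t, p in pairs if t == p)     # correct predictions per class
    false_pos = Counter(p for t, p in pairs if t != p)   # wrong predictions landing on class p
    accuracy = [
        correct[c] / support[c] if support[c] else float("nan")
        for c in range(num_classes)
    ]
    fps = [false_pos[c] for c in range(num_classes)]
    return accuracy, fps


# Toy balanced data: four samples per class, with class 2 playing the "hard" class.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 2, 1, 1, 1, 2, 2, 2, 0, 1]

accuracy, fps = per_class_stats(y_true, y_pred, num_classes=3)
accuracy_gap = max(accuracy) - min(accuracy)  # one simple disparity measure

print(accuracy)      # [0.75, 0.75, 0.5]
print(fps)           # [1, 1, 2]
print(accuracy_gap)  # 0.25
```

In this toy example the class with the lowest accuracy also attracts the most false positives, which is the pattern the abstract describes. The data are constructed to show it and are not evidence for it.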

Paper Structure (25 sections, 1 theorem, 10 equations, 16 figures, 6 tables)

Key Result

Lemma 1

(Gradients Convergence Condition). When the training of model $\mathcal{M}$ converges, the gradients with respect to the parameters are $\mathbf{0}$. Then the following equation holds, where $\mathbf{CF}[i]$ represents the number of samples in class $i$.

Proof. By the rule of backpropagation, the gradient for sample $x_{i}$ with respect to $\mathcal{M}(x_{i})$ is as follows. The gradients integ...

Figures (16)

  • Figure 1: The unfairness is prevalent in image classification across various datasets, network architectures, and model capacities. We sort classes by the performance of WideResNet-34-10 on CIFAR-100 and ResNet-50 on ImageNet. For CLIP models, we sort classes by the zero-shot performance on ImageNet of the CLIP ResNet-50 model. Note that data rebalancing is considered in the collection of WIT-400M (Radford et al., 2021).
  • Figure 2: Data diversity imbalance. (a) Feature variance $\ell_2$-norm. (b) t-SNE visualization on ImageNet. We sort classes by the performance of ResNet-50 on ImageNet.
  • Figure 3: Analysis on ImageNet-LT. (a) Data distribution and per-class accuracy with ResNet-50 on ImageNet-LT; (b) The $\ell_2$-norm of classifier weights before/after classifier rebalancing; (c), (d), and (e) evaluate the prediction bias on "Many", "Medium", and "Few" classes data separately. We sort classes by the number of samples in the classes on ImageNet-LT.
  • Figure 4: Model prediction bias. (a) and (b) show the relationship between class accuracy and prediction bias. ImageNet-LT and ImageNet lead to opposite conclusions; the details are discussed in Section \ref{sec:prediction_bias}. (c), (d), and (e) evaluate the prediction bias on "Easy", "Medium", and "Hard" classes data separately. ResNet-50 is used for ImageNet-LT while ViT-B is adopted on ImageNet. Classes are sorted by their frequency on ImageNet-LT. On ImageNet, classes are sorted by the performance of ResNet-50.
  • Figure 5: Analysis of fairness on bias from representation and classifier. (a) The $\ell_2$-norm of classifier weights on ImageNet; (b) Class mean angles defined in Eq. \ref{eq:cma}; (c) Unfairness exists with ResNet-50/101 and the $k$-NN algorithm; (d) Unfairness exists with ResNet-50/101 and the ETF classifier. Classes are sorted by the performance of ResNet-50 on ImageNet.
  • ...and 11 more figures
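
The "feature variance $\ell_2$-norm" in Figure 2(a) can be read as a per-class measure of how spread out a class's features are. The sketch below shows one plausible way to compute such a quantity. It is an assumption rather than the paper's procedure, since the exact definition is not given in this excerpt: it takes the population variance of each feature dimension within a class and then the $\ell_2$-norm of the resulting variance vector. The function name `feature_variance_norm` and the toy feature vectors are hypothetical.

```python
import math
from statistics import pvariance


def feature_variance_norm(features):
    """l2-norm of the per-dimension variance of one class's feature vectors.

    `features` is a list of equal-length lists, one per sample.
    Assumed definition for illustration; the paper may define it differently.
    """
    dims = list(zip(*features))                # transpose: one tuple per feature dimension
    variances = [pvariance(d) for d in dims]   # population variance along each dimension
    return math.sqrt(sum(v * v for v in variances))


# Toy features: a tightly clustered class and a more diverse one.
tight_class = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
spread_class = [[0.0, 0.0], [2.0, 4.0]]

print(feature_variance_norm(tight_class))   # 0.0
print(feature_variance_norm(spread_class))  # sqrt(1**2 + 4**2) = sqrt(17)
```

Under this reading, a larger value indicates more diverse features within a class. In practice the features would come from a trained backbone such as ResNet-50, not from hand-written vectors.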

Theorems & Definitions (3)

  • Definition 1
  • Definition 2
  • Lemma 1