Rethinking Cost-sensitive Classification in Deep Learning via Adversarial Data Augmentation

Qiyuan Chen, Raed Al Kontar, Maher Nouiehed, Jessie Yang, Corey Lester

TL;DR

This work tackles cost-sensitive multiclass classification in over-parameterized DNNs, where standard training can erase cost distinctions due to perfect interpolation. It introduces CSADA, a cost-sensitive adversarial data augmentation framework that generates targeted perturbations $\delta^{(y_i,z)} = \arg \max_{\|\delta\| \le \epsilon} p_z(\theta; (x_i+\delta, y_i))$ and optimizes a penalized augmented loss $\ell_{\text{augmented}}(\theta; x_i, y_i, \delta) = \ell(f(\theta,x_i),y_i) + \lambda \sum_z \tilde c(y_i,z) \, \ell(f(\theta, x_i+\delta^{(y_i,z)}), y_i)$ with weights $\tilde c(y,z) = c(y,z)^\tau / \sum c(y,z)^\tau$. A stochastic variant reduces computation by sampling a single critical pair per batch. Empirically, CSADA lowers overall misclassification cost and critical errors on MNIST, CIFAR-10, and the PMI dataset while maintaining comparable accuracy, demonstrating a practical route to embed cost-awareness into deep classifiers and potentially other models.
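The weighting and the augmented loss can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names (`normalized_costs`, `cross_entropy`, `augmented_loss`) are hypothetical, the normalization of $\tilde c$ is assumed to run over the target class $z$ for each true class $y$ (the summation index is not stated above), and the perturbations are assumed to be supplied by a separate attack step.

```python
import numpy as np

def normalized_costs(C, tau=1.0):
    """Sketch of c~(y, z) = c(y, z)^tau / sum_z' c(y, z')^tau.

    Assumption: each row (true class y) is normalized over target classes z.
    Rows with no positive cost receive all-zero weights.
    """
    C = np.asarray(C, dtype=float)
    powered = np.where(C > 0, C ** tau, 0.0)
    row_sums = powered.sum(axis=1, keepdims=True)
    return np.divide(powered, row_sums,
                     out=np.zeros_like(powered), where=row_sums > 0)

def cross_entropy(logits, y):
    """Cross-entropy of a single example: -log softmax(logits)[y]."""
    shifted = logits - logits.max()          # stabilize the exponentials
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[y]

def augmented_loss(predict_logits, x, y, deltas, C_tilde, lam=1.0):
    """Clean loss plus lam * sum_z c~(y, z) * loss on x + delta^(y, z).

    predict_logits: callable mapping an input to a logit vector (stands in for f(theta, .)).
    deltas: dict mapping target class z to a perturbation shaped like x.
    """
    loss = cross_entropy(predict_logits(x), y)
    for z, delta in deltas.items():
        loss += lam * C_tilde[y, z] * cross_entropy(predict_logits(x + delta), y)
    return loss
```

With $\tau = 1$ the weights are proportional to the raw costs; a larger $\tau$ concentrates the penalty on the most expensive pairs, and setting `lam=0` recovers the ordinary training loss.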

Abstract

Cost-sensitive classification is critical in applications where misclassification errors widely vary in cost. However, over-parameterization poses fundamental challenges to the cost-sensitive modeling of deep neural networks (DNNs). The ability of a DNN to fully interpolate a training dataset can render a DNN, evaluated purely on the training set, ineffective in distinguishing a cost-sensitive solution from its overall accuracy maximization counterpart. This necessitates rethinking cost-sensitive classification in DNNs. To address this challenge, this paper proposes a cost-sensitive adversarial data augmentation (CSADA) framework to make over-parameterized models cost-sensitive. The overarching idea is to generate targeted adversarial examples that push the decision boundary in cost-aware directions. These targeted adversarial samples are generated by maximizing the probability of critical misclassifications and used to train a model with more conservative decisions on costly pairs. Experiments on well-known datasets and a pharmacy medication image (PMI) dataset made publicly available show that our method can effectively minimize the overall cost and reduce critical errors, while achieving comparable performance in terms of overall accuracy.
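To make the idea of a targeted adversarial example concrete, the following sketch runs projected gradient ascent on $\log p_z$ for a linear softmax classifier. Several choices here are assumptions made for illustration and are not taken from the paper: the linear model with hypothetical parameters `W` and `b`, the use of an $L_\infty$ ball for $\|\delta\| \le \epsilon$, the signed-gradient step, and the step size and iteration count. Maximizing $\log p_z$ gives the same maximizer as maximizing $p_z$ because the logarithm is monotone.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax of a 1-D logit vector."""
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()

def targeted_perturbation(W, b, x, z, eps=0.1, step=0.02, n_steps=20):
    """Search for delta with |delta|_inf <= eps that raises p_z(x + delta).

    Model (an assumption for this sketch): p = softmax(W @ x + b).
    """
    delta = np.zeros_like(x)
    for _ in range(n_steps):
        p = softmax(W @ (x + delta) + b)
        # Gradient of log p_z with respect to the input for a linear softmax model.
        grad = W[z] - p @ W
        # Signed ascent step, then projection back onto the L-infinity ball.
        delta = np.clip(delta + step * np.sign(grad), -eps, eps)
    return delta
```

In a deep network the analytic gradient would be replaced by automatic differentiation, but the loop structure (ascent step followed by projection) stays the same.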

Paper Structure

This paper contains 19 sections, 1 theorem, 17 equations, 7 figures, 8 tables, and 3 algorithms.

Key Result

Theorem 1

Consider the objective defined in the bi-level formulation, with binary labels $\{y, \bar{y} = 1-y\}$ and $\ell(\cdot)$ the cross-entropy loss. Then solving the bi-level formulation is equivalent to solving a min-max problem, where $\delta = \left( \delta_{i} \right)$ for all $i \in \{1, \ldots ,N\}$.
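A short worked step may help show why the binary case is special. It assumes that the inner problem of the bi-level formulation is the targeted attack from the TL;DR, which picks the perturbation maximizing the probability of the wrong class $\bar{y}_i$. It illustrates the intuition only and is not the paper's proof or its exact min-max objective.

```latex
% Binary case: the two class probabilities sum to one,
% and the cross-entropy loss is -log of the true-class probability.
p_{\bar{y}_i}(\theta; (x_i + \delta, y_i)) = 1 - p_{y_i}(\theta; (x_i + \delta, y_i)),
\qquad
\ell(f(\theta, x_i + \delta), y_i) = -\log p_{y_i}(\theta; (x_i + \delta, y_i)).

% Both 1 - p and -log p are strictly decreasing in p, so the maximizers coincide:
\arg\max_{\|\delta\| \le \epsilon} p_{\bar{y}_i}(\theta; (x_i + \delta, y_i))
  = \arg\min_{\|\delta\| \le \epsilon} p_{y_i}(\theta; (x_i + \delta, y_i))
  = \arg\max_{\|\delta\| \le \epsilon} \ell(f(\theta, x_i + \delta), y_i).
```

Under that assumption, the targeted attack and the loss-maximizing attack select the same perturbation, which is consistent with an outer minimization over $\theta$ wrapped around an inner maximization over $\delta$.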

Figures (7)

  • Figure 1: Similarity of Baseline and cost-sensitive model
  • Figure 2: Before (toy panels 1-3) and After (toy panels 4-6) CSADA
  • Figure 3: Comparison of Baseline, Penalty, CSADA models on MNIST
  • Figure 4: Comparison of Baseline, Penalty, CSADA Models on CIFAR-10
  • Figure 5: Cost Changes in Response to Hyperparameter $\lambda$
  • ...and 2 more figures

Theorems & Definitions (3)

  • Theorem 1
  • proof
  • Remark 2