Rethinking Cost-sensitive Classification in Deep Learning via Adversarial Data Augmentation

Qiyuan Chen; Raed Al Kontar; Maher Nouiehed; Jessie Yang; Corey Lester

TL;DR

This work tackles cost-sensitive multiclass classification in over-parameterized DNNs, where standard training can erase cost distinctions because the model perfectly interpolates the training data. It introduces CSADA, a cost-sensitive adversarial data augmentation framework that generates targeted perturbations $\delta^{(y_i,z)} = \arg \max_{\|\delta\| \le \epsilon} p_z(\theta; (x_i+\delta, y_i))$ and optimizes a penalized augmented loss $\ell_{\text{augmented}}(\theta; x_i, y_i, \delta) = \ell(f(\theta,x_i),y_i) + \lambda \sum_z \tilde c(y_i,z)\, \ell(f(\theta, x_i+\delta^{(y_i,z)}), y_i)$ with weights $\tilde c(y,z) = c(y,z)^\tau / \sum c(y,z)^\tau$. A stochastic variant reduces computation by sampling a single critical pair per batch. Empirically, CSADA lowers overall misclassification cost and critical errors on MNIST, CIFAR-10, and the PMI dataset while maintaining comparable accuracy, demonstrating a practical route to embedding cost-awareness in deep classifiers and potentially other models.
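The weighting and the augmented loss above can be illustrated with a small, standard-library Python sketch. It is not the authors' implementation. The helper names (`normalized_costs`, `augmented_loss`) are hypothetical, and two assumptions are made that the summary does not state: the normalizing sum in $\tilde c(y,z)$ runs over target classes $z \ne y$ for a fixed true class $y$, and $\ell$ is the cross-entropy loss on predicted probabilities.

```python
import math

def normalized_costs(cost, tau):
    """Return weights c(y, z)**tau / sum_z c(y, z)**tau.

    Assumption: the sum runs over z != y for each fixed y (row-wise).
    """
    weights = []
    for y, row in enumerate(cost):
        # The diagonal (correct prediction) carries no cost and gets weight 0.
        powered = [c ** tau if z != y else 0.0 for z, c in enumerate(row)]
        total = sum(powered)
        weights.append([p / total if total > 0 else 0.0 for p in powered])
    return weights

def cross_entropy(probs, label):
    """Cross-entropy of a probability vector against an integer label."""
    return -math.log(probs[label])

def augmented_loss(clean_probs, adv_probs_by_target, label, weights, lam):
    """Clean loss plus lam * sum_z w(y, z) * loss on the z-targeted example.

    adv_probs_by_target maps a target class z to the model's predicted
    probabilities on the perturbed input x + delta^(y, z).
    """
    loss = cross_entropy(clean_probs, label)
    for z, adv_probs in adv_probs_by_target.items():
        loss += lam * weights[label][z] * cross_entropy(adv_probs, label)
    return loss

# Hypothetical 3-class cost matrix: confusing class 0 with class 2 is costly.
cost = [[0, 1, 3],
        [1, 0, 1],
        [3, 1, 0]]
w = normalized_costs(cost, tau=1.0)
loss = augmented_loss([0.5, 0.25, 0.25], {2: [0.25, 0.25, 0.5]}, 0, w, lam=1.0)
```

Under this row-wise assumption, raising $\tau$ concentrates the weight on the most expensive confusions of each class, while $\tau = 0$ spreads it evenly over all wrong classes.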

Abstract

Cost-sensitive classification is critical in applications where misclassification errors vary widely in cost. However, over-parameterization poses fundamental challenges to the cost-sensitive modeling of deep neural networks (DNNs). Because a DNN can fully interpolate its training dataset, a cost-sensitive solution, when evaluated purely on the training set, becomes indistinguishable from its overall accuracy maximization counterpart. This necessitates rethinking cost-sensitive classification in DNNs. To address this challenge, this paper proposes a cost-sensitive adversarial data augmentation (CSADA) framework to make over-parameterized models cost-sensitive. The overarching idea is to generate targeted adversarial examples that push the decision boundary in cost-aware directions. These targeted adversarial samples are generated by maximizing the probability of critical misclassifications and are used to train a model that makes more conservative decisions on costly pairs. Experiments on well-known datasets and a pharmacy medication image (PMI) dataset made publicly available show that our method can effectively minimize the overall cost and reduce critical errors, while achieving comparable performance in terms of overall accuracy.
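The idea of generating a targeted adversarial sample by maximizing the probability of a critical misclassification can be sketched as follows. This is a generic projected sign-gradient ascent on a toy linear softmax classifier, not the paper's algorithm. Everything here is an assumption for illustration: the model (`W`, `b`), the function names, the step size, and the choice of an $\ell_\infty$ ball for the constraint $\|\delta\| \le \epsilon$, since the norm is not specified in the text.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def predict(W, b, x):
    """Class probabilities of a linear softmax model with logits W x + b."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk
              for row, bk in zip(W, b)]
    return softmax(logits)

def sign(v):
    return (v > 0) - (v < 0)

def targeted_perturbation(W, b, x, z, eps, steps=10, step_size=None):
    """Approximate argmax over ||delta||_inf <= eps of p_z(x + delta)."""
    step_size = eps / 4 if step_size is None else step_size
    delta = [0.0] * len(x)
    for _ in range(steps):
        p = predict(W, b, [xi + di for xi, di in zip(x, delta)])
        # For a linear softmax model:
        # d p_z / d x_j = p_z * (W[z][j] - sum_k p_k * W[k][j])
        grad = [p[z] * (W[z][j] - sum(p[k] * W[k][j] for k in range(len(W))))
                for j in range(len(x))]
        # Sign-gradient ascent step, then projection back onto the eps-ball.
        delta = [max(-eps, min(eps, d + step_size * sign(g)))
                 for d, g in zip(delta, grad)]
    return delta

# Toy 3-class model on 2-D inputs (hypothetical values).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x, z, eps = [1.0, 0.0], 1, 0.5
delta = targeted_perturbation(W, b, x, z, eps)
p_before = predict(W, b, x)[z]
p_after = predict(W, b, [xi + di for xi, di in zip(x, delta)])[z]
```

In a cost-sensitive setting, `z` would be chosen as a class whose confusion with the true label is expensive, so the perturbed sample probes the decision boundary in exactly the direction where errors hurt most.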

Paper Structure (19 sections, 1 theorem, 17 equations, 7 figures, 8 tables, 3 algorithms)

Introduction
A Simple Motivational Example
Literature Overview
Cost-Sensitive Learning
Adversarial Attacks and Adversarial training
Model Development
Relation to Adversarial Training
Multi-step Gradient Descent Ascent with Rejection
Stochastic Multi-Ascent Descent with Rejection
Proof of Concept
Experiments
The Failure of Simple Reweighting
Cost-sensitive Training on CIFAR-10 and MNIST
Pharmacy Medication Image (PMI) Dataset
Medication Dispensing Errors Overview
...and 4 more sections

Key Result

Theorem 1

Consider the objective defined in the bi-level formulation with binary labels $\{y, \bar{y} = 1-y\}$ and with $\ell(\cdot)$ being the cross-entropy loss. Then solving the bi-level formulation is equivalent to solving the following min-max problem, where $\delta = \left( \delta_{i} \right)$ for all $i \in \{1, \ldots ,N\}$.

Figures (7)

Figure 1: Similarity of Baseline and cost-sensitive model
Figure 2: Before (subfigures toy1, toy2, toy3) and After (subfigures toy4, toy5, toy6) CSADA
Figure 3: Comparison of Baseline, Penalty, CSADA models on MNIST
Figure 4: Comparison of Baseline, Penalty, CSADA Models on CIFAR-10
Figure 5: Cost Changes in Response to Hyperparameter $\lambda$
...and 2 more figures

Theorems & Definitions (3)

Theorem 1
proof
Remark 2