TabAttackBench: A Benchmark for Adversarial Attacks on Tabular Data

Zhipeng He, Chun Ouyang, Lijie Wen, Cong Liu, Catarina Moreira

TL;DR

TabAttackBench introduces a unified benchmark for adversarial attacks on tabular data, evaluating five white-box attacks (FGSM, BIM, PGD, DeepFool, C&W) across four predictive models (LR, MLP, TabTransformer, FT-Transformer) on 11 datasets. It jointly measures effectiveness (attack success rate) and imperceptibility using four metrics (Proximity, Sparsity, Deviation, Sensitivity), revealing a clear trade-off: ℓ∞-based attacks tend to be more effective but less imperceptible, while ℓ2-based attacks produce more realistic perturbations. The framework exposes dataset- and model-dependent patterns, including pronounced numerical feature perturbation, proximity and deviation dynamics, and transformer-model robustness, providing actionable insights for designing more imperceptible attacks and for developing robust defenses. By offering standardized preprocessing, reproducible pipelines, and open resources, the paper establishes a practical reference for tabular adversarial robustness research and future benchmark development.

Abstract

Adversarial attacks pose a significant threat to machine learning models by inducing incorrect predictions through imperceptible perturbations to input data. While these attacks are well studied in unstructured domains such as images, their behaviour on tabular data remains underexplored due to mixed feature types and complex inter-feature dependencies. This study introduces a comprehensive benchmark that evaluates adversarial attacks on tabular datasets with respect to both effectiveness and imperceptibility. We assess five white-box attack algorithms (FGSM, BIM, PGD, DeepFool, and C&W) across four representative models (LR, MLP, TabTransformer and FT-Transformer) using eleven datasets spanning finance, energy, and healthcare domains. The benchmark employs four quantitative imperceptibility metrics (proximity, sparsity, deviation, and sensitivity) to characterise perturbation realism. The analysis quantifies the trade-off between these two aspects and reveals consistent differences between attack types, with $\ell_\infty$-based attacks achieving higher success but lower subtlety, and $\ell_2$-based attacks offering more realistic perturbations. The benchmark findings offer actionable insights for designing more imperceptible adversarial attacks, advancing the understanding of adversarial vulnerability in tabular machine learning.
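
To make the two evaluation axes concrete, the following is a minimal sketch of how attack success rate and two of the four imperceptibility metrics might be computed. The definitions used here are assumptions chosen for illustration, not the paper's formal ones: success is counted only over inputs the model classified correctly before the attack, proximity is taken as an $\ell_2$ or $\ell_\infty$ distance, and sparsity as the number of changed features. Deviation and sensitivity are omitted because the abstract does not define them. All function names are hypothetical.

```python
import math


def attack_success_rate(clean_preds, adv_preds, labels):
    """Share of correctly classified inputs whose prediction changes after the attack.

    Assumption: inputs that were already misclassified are excluded.
    """
    eligible = [(c, a) for c, a, y in zip(clean_preds, adv_preds, labels) if c == y]
    if not eligible:
        return 0.0
    flipped = sum(1 for c, a in eligible if a != c)
    return flipped / len(eligible)


def proximity(x, x_adv, norm="l2"):
    """Distance between an original row and its adversarial counterpart."""
    diffs = [abs(a - b) for a, b in zip(x, x_adv)]
    if norm == "linf":
        return max(diffs)
    return math.sqrt(sum(d * d for d in diffs))


def sparsity(x, x_adv, tol=1e-8):
    """Number of features changed by more than a small tolerance."""
    return sum(1 for a, b in zip(x, x_adv) if abs(a - b) > tol)


if __name__ == "__main__":
    # Toy values, made up purely to exercise the functions.
    print(attack_success_rate([1, 0, 1, 1], [0, 0, 1, 0], [1, 0, 1, 0]))
    print(proximity([0.0, 0.0], [3.0, 4.0]), proximity([0.0, 0.0], [3.0, 4.0], "linf"))
    print(sparsity([1.0, 2.0, 3.0], [1.0, 2.5, 3.0]))
```

Under these assumed definitions, an attack that shifts every feature by a small amount gives a low $\ell_\infty$ proximity but a high sparsity count. This is one way the effectiveness and imperceptibility trade-off described above can show up in the numbers.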

Paper Structure

This paper contains 64 sections, 21 equations, 20 figures, 9 tables.

Figures (20)

  • Figure 1: A taxonomy of adversarial attack threat models. The taxonomy spans four primary dimensions: Adversary's Influence, Adversary's Knowledge, Adversary's Perturbation Constraints and Adversary's Goals. Highlighted categories indicate the specific research scope of this benchmark.
  • Figure 2: Roadmap of the proposed evaluation framework for benchmarking adversarial attacks on tabular data. The pipeline extends standard machine learning workflows by adding an adversarial attack stage, linking dataset preparation, model training, attack generation, and evaluation to provide a comprehensive measure of effectiveness and imperceptibility.
  • Figure 3: Attack success rate (ASR) of evaluated attack methods on all three mixed datasets (Adult, COMPAS, Electricity) and four models. Electricity shows uniformly high vulnerability across all attacks and models, while Adult and COMPAS reveal clear differences between $\ell_\infty$-based (FGSM, BIM, PGD) and $\ell_2$-based (DeepFool, C&W attack) methods. Transformer models generally require larger perturbations to reach comparable success rates, indicating greater robustness than LR and MLP.
  • Figure 4: Attack success rate (ASR) of evaluated attack methods on two (out of eight) numerical datasets and four ML models.
  • Figure 6: (Cont.) Attack success rate (ASR) of evaluated attack methods on the remaining three (out of eight) numerical datasets and four ML models.
  • ...and 15 more figures

Theorems & Definitions (14)

  • Definition 1: Perturbation Set
  • Definition 2: Adversarial Example
  • Definition 3: Adversarial Attack
  • Definition 4: Unbounded Adversarial Attack
  • Definition 5: Bounded Adversarial Attack
  • Definition 6: FGSM
  • Definition 7: BIM and PGD
  • Definition 8: C&W Attack
  • Definition 9: DeepFool
  • Definition 10: Proximity
  • ...and 4 more
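
The outline above names the bounded attacks without giving their formulas, so the following is a hedged sketch of the standard textbook forms of FGSM and a BIM-style iterative attack against a logistic regression model. It is not the paper's implementation. The weights, inputs, `eps`, and `step` values are made up for illustration, and all function names are hypothetical. PGD is commonly described as this same iteration with a random starting point inside the $\ell_\infty$ ball, which is left out here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(x, y, w, b):
    """Gradient of binary cross-entropy w.r.t. the input for logistic regression.

    For p = sigmoid(w . x + b), the gradient is (p - y) * w.
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, y, w, b, eps):
    """Single step of size eps along the sign of the loss gradient."""
    g = input_gradient(x, y, w, b)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]


def bim(x, y, w, b, eps, step, n_iter):
    """Repeated small sign-gradient steps, clipped to the l_inf ball of radius eps around x."""
    x_adv = list(x)
    for _ in range(n_iter):
        g = input_gradient(x_adv, y, w, b)
        x_adv = [xa + step * sign(gi) for xa, gi in zip(x_adv, g)]
        # Projection: keep every feature within eps of its original value.
        x_adv = [min(max(xa, x0 - eps), x0 + eps) for xa, x0 in zip(x_adv, x)]
    return x_adv


if __name__ == "__main__":
    w, b = [2.0, -1.0], 0.0      # hypothetical model parameters
    x, y = [0.5, 0.2], 1         # hypothetical input row and label
    for name, x_adv in (("fgsm", fgsm(x, y, w, b, 0.3)),
                        ("bim", bim(x, y, w, b, 0.3, 0.1, 5))):
        score = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
        print(name, x_adv, "predicted class:", int(score > 0))
```

The sketch treats every feature as continuous. Categorical features in mixed tabular data would need extra handling, such as projecting back onto valid categories, which is not shown here.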