CaFA: Cost-aware, Feasible Attacks With Database Constraints Against Neural Tabular Classifiers
Matan Ben-Tov, Daniel Deutch, Nave Frost, Mahmood Sharif
TL;DR
CaFA addresses the gap between feature-space adversarial examples and realizable problem-space attacks on neural tabular classifiers by integrating two families of data-integrity constraints, structure constraints and Denial Constraints (DCs), into a cost-aware attack framework. It introduces TabPGD, a tabular-adapted PGD variant (with CWL0) that respects heterogeneous feature domains and minimizes perturbation costs, followed by a SAT/Z3-based projection that ensures DC compliance. Empirical results on three datasets and two model architectures show that DC-based CaFA achieves higher feasible attack success with lower perturbation cost and comparable or faster runtimes than prior methods, and that constraint quality (soundness and completeness) can significantly influence realism and effectiveness. The work demonstrates practical robustness evaluation for deployed tabular models and releases CaFA as an open-source tool, enabling broader, domain-agnostic assessments of adversarial resilience in real-world settings.
Abstract
This work presents CaFA, a system for Cost-aware Feasible Attacks that assesses the robustness of neural tabular classifiers against adversarial examples realizable in the problem space, while minimizing adversaries' effort. To this end, CaFA leverages TabPGD, an algorithm we set forth to generate adversarial perturbations suitable for tabular data, and incorporates integrity constraints automatically mined by state-of-the-art database methods. After producing adversarial examples in the feature space via TabPGD, CaFA projects them onto the mined constraints, which, in turn, improves attack realizability. We tested CaFA with three datasets and two architectures and found, among other results, that the constraints we use are of higher quality (measured via soundness and completeness) than those employed in prior work. Moreover, CaFA achieves higher feasible success rates than prior attacks (i.e., it generates adversarial examples that are often misclassified while satisfying constraints) while simultaneously perturbing few features with lower magnitudes, thus saving effort and improving inconspicuousness. We open-source CaFA, hoping it will serve as a generic system enabling machine-learning engineers to assess their models' robustness against realizable attacks, thus advancing deployed models' trustworthiness.
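To make the notion of a feasible success rate concrete, the following is a minimal Python sketch and not CaFA's implementation. It assumes that a denial constraint is a conjunction of predicates over a candidate row and a reference row that must never all hold together, and it counts an adversarial example as a feasible success only if it is misclassified and violates no constraint against any reference row. The feature names (`edu_level`, `edu_years`, `hours`), the toy classifier `predict`, and the helper names are all hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]
# A predicate compares the candidate row with one reference row.
Predicate = Callable[[Row, Row], bool]
# A denial constraint forbids all of its predicates holding at once.
DenialConstraint = List[Predicate]


def satisfies(candidate: Row,
              reference_rows: Sequence[Row],
              constraints: Sequence[DenialConstraint]) -> bool:
    """Return True if no constraint is violated against any reference row."""
    for ref in reference_rows:
        for dc in constraints:
            if all(pred(candidate, ref) for pred in dc):
                return False  # every predicate held, so the constraint is violated
    return True


def feasible_success_rate(adv_rows: Sequence[Row],
                          true_labels: Sequence[int],
                          predict: Callable[[Row], int],
                          reference_rows: Sequence[Row],
                          constraints: Sequence[DenialConstraint]) -> float:
    """Fraction of adversarial rows that are misclassified AND constraint-satisfying."""
    hits = sum(
        1
        for row, label in zip(adv_rows, true_labels)
        if predict(row) != label and satisfies(row, reference_rows, constraints)
    )
    return hits / len(adv_rows)


# Hypothetical constraint: two rows must not share an education level
# while disagreeing on years of education.
same_level_different_years: DenialConstraint = [
    lambda t, u: t["edu_level"] == u["edu_level"],
    lambda t, u: t["edu_years"] != u["edu_years"],
]

reference = [
    {"edu_level": 1, "edu_years": 9, "hours": 40},
    {"edu_level": 2, "edu_years": 13, "hours": 38},
]


def predict(row: Row) -> int:
    # Toy stand-in for a trained classifier (assumption, not a real model).
    return 1 if row["hours"] > 40 else 0


adversarial = [
    {"edu_level": 1, "edu_years": 9, "hours": 45},   # misclassified and feasible
    {"edu_level": 1, "edu_years": 13, "hours": 45},  # misclassified but violates the constraint
    {"edu_level": 2, "edu_years": 13, "hours": 30},  # feasible but still classified correctly
]
labels = [0, 0, 0]

rate = feasible_success_rate(
    adversarial, labels, predict, reference, [same_level_different_years]
)
print(f"feasible success rate: {rate:.3f}")
```

In this toy setup only the first adversarial row counts, since the second breaks the constraint and the third does not fool the classifier. A plain success rate that ignores constraints would also count the second row, which is the gap that the feasibility requirement is meant to expose.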
