CARE: Ensemble Adversarial Robustness Evaluation Against Adaptive Attackers for Security Applications

Hangsheng Zhang; Jiqiang Liu; Jinsong Dong

TL;DR

CARE introduces a comprehensive cybersecurity adversarial robustness evaluation framework that benchmarks ensemble defenses against adaptive attackers. The approach couples an attack library with a diverse model library and utility evaluations, enabling adaptive ensemble attacks (EMA/TEA) and automatic ensemble defenses (MA-AT/TE-AT) that are optimized via Bayesian methods to maximize defense performance under multiple attack types. The key contribution is Robust Ensemble Adversarial Training (R-AT), which uses a Bayesian-optimized objective to select ensemble weights that resist multiple attacks and adaptive strategies, demonstrated across five security datasets and a range of detectors. The findings show that general ensembles alone cannot guarantee robustness, that adaptive ensemble attacks can defeat simple defenses, and that the proposed R-AT significantly improves defense resilience, enabling practical, scalable evaluation relevant to real-world security systems.
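The idea of selecting ensemble weights that hold up under several attacks at once can be illustrated with a small sketch. This is not the paper's R-AT procedure: the worst-case detection rate used as the objective, the per-model scores in `ATTACK_SETS`, and all function names are hypothetical, and plain random search stands in for the Bayesian optimizer so the example needs only the standard library.

```python
import random

# Hypothetical detector scores for adversarial samples: each tuple holds the
# maliciousness score of three ensemble members for one sample. Values are invented.
ATTACK_SETS = {
    "attack_a": [(0.1, 0.2, 0.8), (0.2, 0.3, 0.9)],
    "attack_b": [(0.2, 0.9, 0.7), (0.3, 0.8, 0.8)],
}


def ensemble_score(weights, member_scores):
    """Weighted average of the members' scores for one sample."""
    return sum(w * s for w, s in zip(weights, member_scores))


def detection_rate(weights, samples, threshold=0.5):
    """Fraction of adversarial samples the weighted ensemble still flags."""
    flagged = sum(ensemble_score(weights, s) >= threshold for s in samples)
    return flagged / len(samples)


def worst_case_rate(weights, attack_sets):
    """Assumed robustness objective: detection rate under the most damaging attack."""
    return min(detection_rate(weights, s) for s in attack_sets.values())


def random_simplex_point(n, rng):
    """Draw non-negative weights that sum to 1."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


def search_weights(attack_sets, n_models, n_trials=500, seed=0):
    """Random search over weights (a stand-in for Bayesian optimization)."""
    rng = random.Random(seed)
    best_w = [1.0 / n_models] * n_models  # start from the uniform ensemble
    best_val = worst_case_rate(best_w, attack_sets)
    for _ in range(n_trials):
        w = random_simplex_point(n_models, rng)
        val = worst_case_rate(w, attack_sets)
        if val > best_val:
            best_w, best_val = w, val
    return best_w, best_val


if __name__ == "__main__":
    weights, value = search_weights(ATTACK_SETS, n_models=3)
    print("weights:", [round(w, 3) for w in weights], "worst-case rate:", value)
```

Optimizing the minimum over attack types, rather than the average, reflects the stated goal of resisting multiple attacks simultaneously. A sample-efficient optimizer matters mainly when each evaluation requires retraining or re-attacking the ensemble, which this toy objective does not.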

Abstract

Ensemble defenses are widely employed in security-related applications to enhance model performance and robustness. Their widespread adoption also raises several questions: Are general ensemble defenses guaranteed to be more robust than individual models? Will stronger adaptive attacks defeat existing ensemble defense strategies as the cybersecurity arms race progresses? Can ensemble defenses achieve adversarial robustness to different types of attacks simultaneously and resist continually adjusted adaptive attacks? Unfortunately, these critical questions remain unresolved because there is no platform for comprehensive evaluation of ensemble adversarial attacks and defenses in the cybersecurity domain. In this paper, we propose a general Cybersecurity Adversarial Robustness Evaluation (CARE) platform that aims to bridge this gap.

Paper Structure (51 sections, 4 equations, 11 figures, 6 tables, 1 algorithm)

Introduction
Background and Related Work
Adversarial Robustness Evaluation Platform
Ensemble Learning-based Security Detectors
Adversarial Attack for Security Detectors
Model Ensemble for Defenses
Ensemble Adversarial Training
Preliminaries
Threat Model
Adversarial Attack for Security Detectors
Restricted Feature-Space attacks
End-to-end Problem-Space attacks
Attacks & Defenses
Attack Strategies.
Defense Strategies.
...and 36 more sections

Figures (11)

Figure 1: The CARE framework consists of three basic components and two ensemble robustness evaluation components. The three basic components are: 1) Attack Library (AL) 2) Model Library (ML) 3) Attack and Defense Utility Evaluation (AUE & DUE); the two ensemble robustness evaluation components are: 1) Adaptive Ensemble Attack Generation (AE-AG) 2) Automatic Ensemble Defense (AutoED).
Figure 2: Adversarial attack effectiveness on different datasets (higher is better).
Figure 3: The effect of attack cost for different attacks: the deep model changes more smoothly than the tree model.
Figure 4: The attack effectiveness of end-to-end adversarial attacks against MalConv.
Figure 5: To measure the effectiveness of adversarial training (AT) on four security datasets, we choose five models (MLP, Xgb., DeepEns, TreeEns, and HeteroEns). AT increases the defense success rate (DSR) while barely affecting the original detection rate (ODR), demonstrating that it is a mature defense technique.
...and 6 more figures
