Robustifying ML-powered Network Classifiers with PANTS

Minhao Jin; Maria Apostolaki


TL;DR

ML-powered network classifiers (MNCs) are vulnerable to adversarial inputs, and existing adversarial ML (AML) methods struggle to ensure realizability and semantic preservation in network settings. PANTS integrates gradient-based AML (e.g., PGD, ZOO) with an SMT solver to generate packet sequences that are adversarial, realizable, and semantics-preserving, and it embeds them into an iterative adversarial training loop. It finds more adversarial samples and yields larger robustness gains than state-of-the-art baselines (e.g., median ASR $=35.31\%$, up to $52.72\%$ robustness improvement) while maintaining accuracy, and it remains effective against stronger or different attackers. PANTS is open-sourced for practical deployment. The approach applies to pipelines that include non-differentiable feature engineering or are otherwise not end-to-end differentiable, offering operators a concrete path to assess and harden MNCs in real networks.
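As a rough illustration of how a median ASR figure could be computed, the sketch below assumes that ASR denotes attack success rate, i.e., the fraction of attacked flows for which at least one adversarial variant is found. The classifier names and outcomes are hypothetical placeholders, not results from the paper.

```python
from statistics import median


def attack_success_rate(found):
    """Fraction of attacked flows for which an adversarial variant was found.

    Assumption: this is the meaning of ASR; the summary does not define it.
    """
    return sum(found) / len(found)


# Hypothetical outcomes (True = an adversarial flow was found) for three
# made-up target classifiers. These are NOT measurements from the paper.
outcomes = {
    "classifier_a": [True, False, False, True],
    "classifier_b": [True, True, True, False],
    "classifier_c": [False, False, True, False],
}

asr = {name: attack_success_rate(hits) for name, hits in outcomes.items()}
median_asr = median(asr.values())
```

Reporting the median across targets, rather than the mean, keeps one unusually weak or strong classifier from dominating the summary number.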

Abstract

Multiple network management tasks, from resource allocation to intrusion detection, rely on some form of ML-based network traffic classification (MNC). Despite their potential, MNCs are vulnerable to adversarial inputs, which can lead to outages, poor decision-making, and security violations, among other issues. The goal of this paper is to help network operators assess and enhance the robustness of their MNC against adversarial inputs. The most critical step is generating inputs that fool the MNC while being realizable under various threat models. Compared to other ML models, finding adversarial inputs against MNCs is more challenging for two reasons: the pipelines contain non-differentiable components (e.g., traffic engineering), and inputs must be constrained to preserve semantics and ensure reliability. These factors prevent the direct use of well-established gradient-based methods developed in adversarial ML (AML). To address these challenges, we introduce PANTS, a practical white-box framework that uniquely integrates AML techniques with Satisfiability Modulo Theories (SMT) solvers to generate adversarial inputs for MNCs. We also embed PANTS into an iterative adversarial training process that enhances the robustness of MNCs against adversarial inputs. PANTS is, in median, 70% and 2x more likely to find adversarial inputs against target MNCs than the state-of-the-art baselines Amoeba and BAP. PANTS improves the robustness of the target MNCs by 52.7% (even against attackers outside of what is considered during robustification) without sacrificing their accuracy.
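The interplay between a gradient-based step and a constraint solver can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the PANTS implementation: the linear classifier, the `(mean, max)` feature extractor, the delay-only threat model, and all function names are hypothetical, and a brute-force search stands in for the SMT solver.

```python
from itertools import product

# Hypothetical toy linear classifier over engineered features: label 1 if score > 0.
W, B = (-0.5, -0.25), 8.0


def features(delays):
    """Toy feature engineering: (mean, max) of inter-packet delays."""
    return (sum(delays) / len(delays), max(delays))


def score(feat):
    return W[0] * feat[0] + W[1] * feat[1] + B


def classify(delays):
    return 1 if score(features(delays)) > 0 else 0


def sign(x):
    return (x > 0) - (x < 0)


def gradient_step(feat, step, eps, original):
    """One PGD-style step in feature space that lowers the score.

    The gradient of the linear score w.r.t. the features is W, so we move
    against sign(W) and clip to an eps-box around the original features.
    """
    moved = [f - step * sign(w) for f, w in zip(feat, W)]
    return tuple(min(max(m, o - eps), o + eps) for m, o in zip(moved, original))


def realizable_flow(delays, target, max_extra):
    """Brute-force stand-in for the SMT solver (assumption, for illustration).

    Threat model: the attacker may only ADD an integer delay in
    [0, max_extra] to each packet. Returns the allowed flow whose
    features are closest to the target feature vector.
    """
    best, best_dist = None, float("inf")
    for extra in product(range(max_extra + 1), repeat=len(delays)):
        cand = [d + e for d, e in zip(delays, extra)]
        dist = sum((a - b) ** 2 for a, b in zip(features(cand), target))
        if dist < best_dist:
            best, best_dist = cand, dist
    return best


flow = [4, 6, 5, 9]                      # original inter-packet delays
orig_feat = features(flow)
target = orig_feat
for _ in range(5):                       # adversarial target in feature space
    target = gradient_step(target, step=1.0, eps=4.0, original=orig_feat)
adv = realizable_flow(flow, target, max_extra=5)   # map back to a valid flow
```

The gradient step works in feature space, where the classifier is easy to attack, while the search step maps the adversarial features back to a concrete packet sequence that respects the threat model. A real SMT solver would replace the exhaustive enumeration, which only scales to very short flows.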


Paper Structure (28 sections, 16 figures, 3 tables, 1 algorithm)

This paper contains 28 sections, 16 figures, 3 tables, 1 algorithm.

Introduction
Motivation & Limitations of Existing Work
Motivating use case
Requirements
Limitations of existing work
Overview
Problem formulation
The promise of Adversarial ML
Adversarial ML limitations
PANTS: Adversarial ML for networking
Design
PANTS end-to-end view
SMT formulations
Adversarial packet sequence identification
Feature importance determination (lines 1-5 in Alg. \ref{alg:feat_selection})
...and 13 more sections

Figures (16)

Figure 1: MNCs are vulnerable to adversarial inputs (circles). Traditional adversarial training sacrifices accuracy to reduce vulnerability (star). PANTS' iterative adversarial training reduces the vulnerability without hurting accuracy (cross).
Figure 2: Overview of the PANTS workflow. PANTS generates adversarial inputs that are also used to iteratively train the target MNC. PANTS receives the implementation of an MNC together with a training dataset and a couple of rules that constrain the generated inputs. At its core, PANTS features an AML component that collaborates with an SMT solver.
Figure 3: Given a network flow $p$, the perturbation $\delta$ is applied to generate an adversarial flow $\delta(p)$, which causes the ML model to return a wrong output.
Figure 4: PANTS combines AML with an SMT solver to generate adversarial and realizable samples (flows), which are used to assess and enhance robustness.
Figure 5: The impact of $k$ on ASR (i.e., on PANTS' ability to generate adversarial examples) against MLP and RF. The ASR increases with $k$ but levels off once $k$ reaches a certain threshold.
...and 11 more figures
