Adversarial Machine Learning: Bayesian Perspectives
David Rios Insua, Roi Naveiro, Victor Gallego, Jason Poulos
TL;DR
The paper addresses adversarial vulnerabilities in ML and the inadequacy of the common knowledge (CK) assumption that underlies game-theoretic AML models. It proposes a Bayesian Adversarial Risk Analysis (ARA) framework that explicitly models the defender's uncertainty about the attacker's beliefs and goals, and integrates it into supervised learning via $p(y|x)$ and $p(x'|x)$. The approach yields a robust posterior predictive mechanism (RAPPD) for operation and a robust adversarial posterior (RAPD) for training, with practical algorithms based on Approximate Bayesian Computation (ABC) and stochastic gradient methods. Case studies in spam detection and vision tasks demonstrate improved robustness, particularly when the defender's uncertainty about attacker behavior is incorporated.
Abstract
Adversarial Machine Learning (AML) is emerging as a major field aimed at protecting machine learning (ML) systems against security threats: in certain scenarios there may be adversaries that actively manipulate input data to fool learning systems. This creates a new class of security vulnerabilities that ML systems may face, and a new desirable property, adversarial robustness, that is essential for trusting operations based on ML outputs. Most work in AML is built upon game-theoretic modelling of the conflict between a learning system and an adversary ready to manipulate input data. This assumes that each agent knows their opponent's interests and uncertainty judgments, facilitating inferences based on Nash equilibria. However, such a common knowledge assumption is not realistic in the security scenarios typical of AML. After reviewing such game-theoretic approaches, we discuss the benefits that Bayesian perspectives provide when defending ML-based systems. We demonstrate how the Bayesian approach allows us to explicitly model our uncertainty about the opponent's beliefs and interests, relaxing unrealistic assumptions and providing more robust inferences. We illustrate this approach in supervised learning settings, and identify relevant future research problems.
