A Geometric Framework for Adversarial Vulnerability in Machine Learning
Brian Bell
TL;DR
The work addresses adversarial vulnerability in neural networks by proposing a geometric framework that blends high-dimensional curvature, persistence, and decision-boundary analysis to formalize robustness questions, including the Dimpled Manifold Hypothesis. It develops exact path kernel representations and their generalized form (EPK and gEPK) to decompose predictions along training paths and across data, enabling finite-width networks to be analyzed with kernel methods. Novel metrics such as $(\boldsymbol{\gamma},\boldsymbol{\sigma})$-stability and $\boldsymbol{\gamma}$-persistence quantify local robustness, and experiments on MNIST and ImageNet reveal that adversarial examples exhibit reduced persistence and distinctive decision-boundary angles; the framework also links to Out-of-Distribution detection and signal-manifold dimension. Overall, the approach provides a rigorous mathematical foundation to study robustness, generalization in Reproducing Kernel Banach Spaces, and connections between distributional learning and neural networks, with practical implications for robustness and uncertainty quantification.
Abstract
This work begins with the aim of using mathematics to understand the intriguing vulnerability of artificial neural networks observed by~\citet{szegedy2013}. Along the way, we develop novel tools with applications well beyond the adversarial domain, within a rigorous mathematical framework for examining the problem. Our goal is to build theory that can support increasingly sophisticated conjectures about adversarial attacks, with a particular focus on the so-called ``Dimpled Manifold Hypothesis'' of~\citet{shamir2021dimpled}. Chapter one covers the history and architecture of neural networks. Chapter two presents the background of adversarial vulnerability: starting from the seminal paper by~\citet{szegedy2013}, we develop the theory of adversarial perturbation and attack. Chapter three builds a theory of persistence, related to Ricci curvature, that can be used to measure properties of decision boundaries. We use this foundation to state a conjecture about adversarial attacks. Chapters four and five are a sudden and wonderful digression: they examine a related body of theory for the spatial analysis of neural networks as approximations of kernel machines, which becomes a novel theory for representing neural networks with bilinear maps. These heavily mathematical chapters set up a framework and begin exploring applications of what may become an important theoretical foundation for analyzing neural network learning with spatial and geometric information. We conclude by setting up our new methods to address the conjecture from chapter three in continuing research.
