Complexity Matters: Effective Dimensionality as a Measure for Adversarial Robustness

David Khachaturov, Robert Mullins

TL;DR

This paper investigates the relationship between a model's effective dimensionality, which can be thought of as a measure of model complexity, and its robustness properties. It reveals a near-linear inverse relationship between effective dimensionality and adversarial robustness: models with lower effective dimensionality exhibit better robustness.

Abstract

Quantifying robustness in a single measure for the purposes of model selection, development of adversarial training methods, and anticipating trends has so far been elusive. The simplest metric to consider is the number of trainable parameters in a model, but this has previously been shown to be insufficient to explain robustness properties. A variety of other metrics, such as ones based on boundary thickness and gradient flatness, have been proposed but have been shown to be inadequate proxies for robustness. In this work, we investigate the relationship between a model's effective dimensionality, which can be thought of as model complexity, and its robustness properties. We run experiments on commercial-scale models that are often used in real-world environments, such as YOLO and ResNet. We reveal a near-linear inverse relationship between effective dimensionality and adversarial robustness; that is, models with a lower effective dimensionality exhibit better robustness. We investigate the effect of a variety of adversarial training methods on effective dimensionality and find the same inverse linear relationship, suggesting that effective dimensionality can serve as a useful criterion for model selection and robustness evaluation, and that it provides a more nuanced and effective metric than parameter count or previously tested measures.
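
The abstract does not define effective dimensionality. As a rough illustration, the sketch below assumes a definition commonly used in the literature, N_eff(H, z) = sum_i lambda_i / (lambda_i + z), where lambda_i are the eigenvalues of the Hessian of the loss and z > 0 is a regularization constant. The function name, the choice z = 1, and the toy eigenvalues are illustrative assumptions, not values or code from the paper.

```python
import math


def effective_dimensionality(eigenvalues, z=1.0):
    """Return sum_i lambda_i / (lambda_i + z) for a list of Hessian eigenvalues.

    Assumed definition (not stated in the abstract). `z` is a positive
    regularization constant; its value here is an arbitrary choice.
    """
    if z <= 0:
        raise ValueError("z must be positive")
    # Clip small negative eigenvalues, which can arise from numerical noise.
    clipped = [max(float(lam), 0.0) for lam in eigenvalues]
    # Directions with lambda >> z contribute ~1; directions with lambda << z contribute ~0.
    return math.fsum(lam / (lam + z) for lam in clipped)


# Hypothetical eigenvalue spectrum standing in for a real model's Hessian.
toy_eigenvalues = [100.0, 10.0, 1.0, 0.0]
n_eff = effective_dimensionality(toy_eigenvalues, z=1.0)
print(f"{n_eff:.3f} of {len(toy_eigenvalues)} directions")  # prints "2.399 of 4 directions"
```

Under this definition the measure counts how many parameter directions are well determined by the data, so it can be much smaller than the raw parameter count. For large models the full spectrum is not computed directly; in practice only the leading eigenvalues would be estimated, which this sketch does not attempt.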

Paper Structure

This paper contains 17 sections, 3 equations, 5 figures.

Figures (5)

  • Figure 1: Measuring the effect of scale on a model's effective dimensionality, on various datasets. The exact models tested from each model family are listed in the experiments section.
  • Figure 2: Relative adversarial performance, plotted against the respective model's effective dimensionality. A description of the performance metric is given in the experiments section. We report the top-5 accuracy for AutoAttack and the top-1 accuracy for PGD and GN.
  • Figure 3: Effect of various adversarial training methods, described in the experiments section, on the effective dimensionality of the respective models. AWP corresponds to Adversarial Weight Perturbation, and AWP+ED involves AWP and extra training data.
  • Figure 4: Relative adversarial performance under various adversarial training methods, plotted against the respective model's effective dimensionality. A description of the performance metric is given in the experiments section. AWP corresponds to Adversarial Weight Perturbation, and AWP+ED involves AWP and extra training data.
  • Figure 5: Relative adversarial performance, plotted against the respective model's size, measured in number of trainable parameters. A description of the performance metric is given in the experiments section. We report the top-5 accuracy for AutoAttack and the top-1 accuracy for PGD and GN.