Shapes are not enough: CONSERVAttack and its use for finding vulnerabilities and uncertainties in machine learning applications

Philip Bechtle, Lucie Flek, Philipp Alexander Jung, Akbar Karimi, Timo Saala, Alexander Schmidt, Matthias Schott, Philipp Soldin, Christopher Wiebusch, Ulrich Willemsen

Abstract

In High Energy Physics, as in many other fields of science, machine learning techniques have been crucial in advancing our understanding of fundamental phenomena. Increasingly, deep learning models are applied to analyze both simulated and experimental data. Most experiments have a rigorous regime of testing for physically motivated systematic uncertainties in place. Numerically evaluating these tests for differences between data and simulation quantifies the effect of potential sources of mismodelling on the machine learning output. In addition, marginal distributions and (linear) feature correlations are compared thoroughly between data and simulation in "control regions". However, guidance by physical motivation, together with the need to restrict comparisons to specific regions, does not guarantee that all possible sources of deviation have been accounted for. We therefore propose a new adversarial attack - the CONSERVAttack - designed to exploit the space of hypothetical deviations between simulation and data that remains after the above-mentioned tests. The resulting adversarial perturbations stay within the uncertainty bounds - evading standard validation checks - while successfully fooling the underlying model. We further propose strategies to mitigate such vulnerabilities and argue that robustness to adversarial effects must be considered when interpreting results from deep learning in particle physics.

Paper Structure (29 sections, 13 equations, 32 figures, 22 tables)

Figures (32)

  • Figure 1: Comparison of three feature distributions between clean and adversarial events in an attack setting. The left plot shows the feature with the smallest change in Jensen-Shannon distance, the middle plot the feature with the median change, and the right plot the feature with the largest change for the given attack run.
  • Figure 2: Comparison of the correlation matrices of the clean events (left) and of the adversarial events resulting from the attack (right).
  • Figure 3: Fooling ratio of the adversarial attack on the Higgs dataset over 10 runs. The average performance of the attack is shown as a dashed line, and its standard deviation as a dotted line.
  • Figure 4: Average Jensen-Shannon distance of the adversarial attack on the Higgs dataset over 10 runs. The average performance of the attack is shown as a dashed line, and its standard deviation as a dotted line.
  • Figure 5: Average Frobenius norm of the adversarial attack on the Higgs dataset over 10 runs. The average performance of the attack is shown as a dashed line, and its standard deviation as a dotted line.
  • ...and 27 more figures
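
The captions above refer to three quantities: a per-feature Jensen-Shannon distance, a Frobenius norm related to the correlation matrices, and a fooling ratio. The following is a minimal sketch of how such quantities could be computed with NumPy. It is not the authors' implementation. The function names, the histogram binning, the base-2 logarithm, the use of the Frobenius norm of the difference between the clean and adversarial correlation matrices, and the definition of the fooling ratio (fraction of originally correct predictions that the attack changes to an incorrect one) are all assumptions made for illustration.

```python
import numpy as np


def jensen_shannon_distance(clean, adversarial, bins=50):
    """Base-2 Jensen-Shannon distance in [0, 1] between the histograms
    of one feature. The binning is an assumption of this sketch."""
    lo = min(clean.min(), adversarial.min())
    hi = max(clean.max(), adversarial.max())
    # Shared bin edges so both histograms are directly comparable.
    p, _ = np.histogram(clean, bins=bins, range=(lo, hi))
    q, _ = np.histogram(adversarial, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return float(np.sqrt(max(divergence, 0.0)))


def correlation_frobenius_norm(clean, adversarial):
    """Frobenius norm of the difference between the linear (Pearson)
    correlation matrices of two (n_events, n_features) arrays."""
    c_clean = np.corrcoef(clean, rowvar=False)
    c_adv = np.corrcoef(adversarial, rowvar=False)
    return float(np.linalg.norm(c_clean - c_adv, ord="fro"))


def fooling_ratio(pred_clean, pred_adv, labels):
    """Assumed definition: fraction of originally correct predictions
    that become incorrect after the perturbation."""
    correct = pred_clean == labels
    if not correct.any():
        return 0.0
    return float(np.mean(pred_adv[correct] != labels[correct]))


# Toy usage with synthetic data (not the Higgs dataset).
rng = np.random.default_rng(0)
clean = rng.normal(size=(10_000, 5))
adversarial = clean + rng.normal(scale=0.01, size=clean.shape)

js_per_feature = [
    jensen_shannon_distance(clean[:, i], adversarial[:, i])
    for i in range(clean.shape[1])
]
print("mean JS distance:", np.mean(js_per_feature))
print("Frobenius norm:", correlation_frobenius_norm(clean, adversarial))
```

Small values of both the Jensen-Shannon distance and the Frobenius norm indicate that the perturbed events would pass the kind of marginal and correlation checks described in the abstract, which is why a high fooling ratio under those conditions is the case of interest.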