Towards more Practical Threat Models in Artificial Intelligence Security

Kathrin Grosse, Lukas Bieringer, Tarek Richard Besold, Alexandre Alahi

TL;DR

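This paper revisits the threat models of the six most studied attacks in AI security research and matches them to AI usage in practice via a survey with 271 industrial practitioners. It finds that all existing threat models are applicable, but that research is often too generous with the attacker, assuming access to information that is not frequently available in real-world settings.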

Abstract

Recent works have identified a gap between research and practice in artificial intelligence security: threats studied in academia do not always reflect the practical use and security risks of AI. For example, while models are often studied in isolation, they form part of larger ML pipelines in practice. Recent works also brought forward that adversarial manipulations introduced by academic attacks are impractical. We take a first step towards describing the full extent of this disparity. To this end, we revisit the threat models of the six most studied attacks in AI security research and match them to AI usage in practice via a survey with 271 industrial practitioners. On the one hand, we find that all existing threat models are indeed applicable. On the other hand, there are significant mismatches: research is often too generous with the attacker, assuming access to information not frequently available in real-world settings. Our paper is thus a call for action to study more practical threat models in artificial intelligence security.

Paper Structure (25 sections, 4 figures, 8 tables)

Figures (4)

  • Figure 1: Backdoor threat model as a percentage of our participants' replies. We report third-party access: White denotes incomplete data or an irrelevant threat model (e.g., only test data accessible). Black represents no access, turquoise the backdoor threat model. Light turquoise denotes insufficient access for backdoors, but sufficient access for poisoning attacks.
  • Figure 2: Evasion threat models as a percentage of our participants' replies. We report third-party access: White denotes incomplete data or an irrelevant threat model (e.g., only model accessible). Black represents no access, turquoise white-box and light turquoise black-box evasion threat models.
  • Figure 3: Model stealing threat model as a percentage of our participants' replies. We report third-party access: White denotes incomplete data or an irrelevant threat model (e.g., only test inputs are accessible). Black represents no access, turquoise denotes the academic threat model, gray that the attack is obsolete as the model is available. Red denotes a threat model that is rarely studied in current research.
  • Figure 4: Membership and attribute inference threat models as a percentage of our participants' replies. We report third-party access: White denotes incomplete data or irrelevant threat models, black represents no access, turquoise denotes existing threat models, gray means that the attack is obsolete as the training data is also available. For membership, red denotes a threat model not studied so far. In the case of attribute inference, turquoise denotes no model access, but the property can directly be inferred from the training data.
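
The categories in the Figure 2 caption can be read as a small decision rule over what a third party can access. The following is a minimal sketch of such a rule, not the authors' analysis code. The `Reply` type, its two fields, and the exact conditions for white-box and black-box access are assumptions made for illustration; only the four category names and the "only model accessible" example come from the caption.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Reply:
    """Hypothetical survey reply; None means the question was not answered."""
    model_access: Optional[bool]       # can a third party access the model?
    test_input_access: Optional[bool]  # can a third party access test inputs?


def evasion_category(reply: Reply) -> str:
    """Map one reply to a Figure 2 category (the mapping rules are assumed)."""
    if reply.model_access is None or reply.test_input_access is None:
        return "incomplete or irrelevant"  # white: incomplete data
    if not reply.model_access and not reply.test_input_access:
        return "no access"                 # black
    if reply.model_access and reply.test_input_access:
        return "white-box"                 # turquoise (assumed condition)
    if reply.test_input_access:
        return "black-box"                 # light turquoise (assumed condition)
    return "incomplete or irrelevant"      # white: only model accessible


def category_percentages(replies: List[Reply]) -> Dict[str, float]:
    """Return the share of replies per category, in percent."""
    if not replies:
        return {}
    counts = Counter(evasion_category(r) for r in replies)
    return {name: 100.0 * n / len(replies) for name, n in counts.items()}


if __name__ == "__main__":
    # Made-up replies for illustration only; not data from the survey.
    demo = [
        Reply(True, True),
        Reply(False, True),
        Reply(False, False),
        Reply(True, False),
        Reply(None, True),
    ]
    for name, share in sorted(category_percentages(demo).items()):
        print(f"{name}: {share:.1f}%")
```

The same pattern would extend to the other figures by swapping in the access types and categories named in their captions.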