Table of Contents
Fetching ...

Generative AI Security: Challenges and Countermeasures

Banghua Zhu, Norman Mu, Jiantao Jiao, David Wagner

TL;DR

This paper delves into the unique security challenges posed by Generative AI, and outlines potential research directions for managing these risks.

Abstract

Generative AI's expanding footprint across numerous industries has led to both excitement and increased scrutiny. This paper delves into the unique security challenges posed by Generative AI, and outlines potential research directions for managing these risks.

Generative AI Security: Challenges and Countermeasures

TL;DR

This paper delves into the unique security challenges posed by Generative AI, and outlines potential research directions for managing these risks.

Abstract

Generative AI's expanding footprint across numerous industries has led to both excitement and increased scrutiny. This paper delves into the unique security challenges posed by Generative AI, and outlines potential research directions for managing these risks.
Paper Structure (28 sections, 4 figures)

This paper contains 28 sections, 4 figures.

Figures (4)

  • Figure 1: An example of a prompt injection attack on Bing Chat. An injection prompt is hidden in the website content, leading to undesired behavior of the chat model. In this example, the attack causes the model to start outputting emojis, but it could also have more serious consequences, such as outputting disinformation or abusive content.
  • Figure 2: Rule-based defenses can be easily defeated.
  • Figure 3: An AI Firewall, built by apply a moderation model to LLM inputs and outputs.
  • Figure 4: An integrated firewall can use visibility into the model to detect more attacks.