Encrypted Prompt: Securing LLM Applications Against Unauthorized Actions

Shih-Han Chan

TL;DR

This paper addresses the risk of prompt injection and API misuse in LLM-enabled agents by introducing an Encrypted Prompt that is attached to each user prompt. The Encrypted Prompt comprises a delimiter, a dynamic permission set, and a public key, enabling server-side verification so that only actions within the permitted scope are executed. It provides a flexible, application-level defense that can adapt permissions based on user, device, and server status, and can be combined with other safety approaches. The approach offers practical protection without retraining, but the paper also notes limitations: permissions must be managed on the device, and harmful actions that are themselves authorized by the permission model are not blocked.

Abstract

Security threats like prompt injection attacks pose significant risks to applications that integrate Large Language Models (LLMs), potentially leading to unauthorized actions such as API misuse. Unlike previous approaches that aim to detect these attacks on a best-effort basis, this paper introduces a novel method that appends an Encrypted Prompt to each user prompt, embedding the current permissions. These permissions are verified before any action generated by the LLM (such as an API call) is executed. If the permissions are insufficient, the LLM's action is not executed. This approach guarantees that only LLM-generated actions within the scope of the current permissions can proceed. When adversarial prompts are introduced to mislead the LLM, verifying the permissions in the Encrypted Prompt ensures that any unauthorized actions the LLM generates are not executed. Thus, threats like prompt injection attacks that trigger the LLM to generate harmful actions can be effectively mitigated.
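The verification step described in the abstract can be pictured as a gate between the LLM's output and the API executor. The following is a minimal sketch, not the paper's implementation: the `ApiCall` type, the `gate_calls` function, and the choice to represent permissions as a plain set of API names are assumptions made for illustration. The API names are borrowed from the paper's Figure 1.

```python
from dataclasses import dataclass, field


@dataclass
class ApiCall:
    """An action proposed by the LLM (hypothetical shape)."""
    name: str
    args: dict = field(default_factory=dict)


def gate_calls(calls, permitted):
    """Split LLM-proposed calls into (executed, rejected) lists.

    `permitted` stands in for the permission set carried by the
    Encrypted Prompt; here it is assumed to be a set of API names.
    """
    executed, rejected = [], []
    for call in calls:
        if call.name in permitted:
            executed.append(call)   # within scope: may proceed
        else:
            rejected.append(call)   # exceeds scope: never executed
    return executed, rejected


# Current permissions allow photo search only.
permitted = {"Find_Photo"}
proposed = [
    ApiCall("Find_Photo", {"query": "beach"}),
    ApiCall("Delete_Email", {"id": 7}),  # e.g. produced by an injected prompt
]
executed, rejected = gate_calls(proposed, permitted)
print("executed:", [c.name for c in executed])
print("rejected:", [c.name for c in rejected])
```

The point of the design is that the check does not depend on detecting the attack: whatever the LLM was tricked into proposing, a call outside the permission set is simply never executed.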

Paper Structure

This paper contains 10 sections, 2 equations, 4 figures.

Figures (4)

  • Figure 1: A simplified example illustrating how the Encrypted Prompt works. The user submits a prompt from their device; an encrypted prompt is appended to it and the result is sent to the server. The LLM generates API calls and responses based on the user's prompt. Before executing these API calls, the server checks the permissions specified in the encrypted prompt. If an API call exceeds the permissions, the server may reject the request or ask the user for additional verification. For example, (a) a Delete_Email API call (generated from an adversarial prompt) exceeds the current permission level and is rejected, whereas (b) a Find_Photo API call is within the permitted scope and is executed.
  • Figure 2: Malicious User Scenario. The malicious user's prompt contains adversarial text and prompts designed to manipulate the LLM into generating harmful API calls. The Send_Email API call is blocked because it exceeds the permissions in the encrypted prompt. (Actual execution paths are highlighted.)
  • Figure 3: Malicious Content from Online Source. Content retrieved from a website includes adversarial text and prompts, leading the LLM to generate API calls beyond the current permission scope. In this example, the Web_Crawl API call is executed to read online text as LLM input because the current permissions allow it. However, when the LLM generates a Send_Email API call from the online text, the call is rejected because it exceeds the current permissions. (Actual execution paths are highlighted.)
  • Figure 4: Malicious LLM Scenario. The LLM generates unexpected API calls even when the user's input prompt is clean. In this example, the Move_Data API call generated by the LLM is rejected because the current permissions are insufficient. (Actual execution paths are highlighted.)
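
The outcomes shown in the figures (execute, reject, or ask the user for additional verification) can be replayed with a small decision function. This is an illustrative sketch under stated assumptions: the `Decision` enum, the `decide` function, and the `verifiable` set (APIs that a deployment chooses to escalate to the user rather than reject outright) are hypothetical names, and the paper may decide between rejection and verification differently.

```python
from enum import Enum


class Decision(Enum):
    EXECUTE = "execute"
    REJECT = "reject"
    VERIFY = "ask user for additional verification"


def decide(api_name, permitted, verifiable=frozenset()):
    """Return the server's decision for one LLM-generated API call.

    `permitted` is the permission set taken from the encrypted prompt.
    `verifiable` is a hypothetical set of APIs that are escalated to
    the user instead of being rejected outright.
    """
    if api_name in permitted:
        return Decision.EXECUTE
    if api_name in verifiable:
        return Decision.VERIFY
    return Decision.REJECT


# Figure 3 scenario: web crawling is permitted, sending email is not.
permitted = {"Web_Crawl"}
trace = [(name, decide(name, permitted)) for name in ("Web_Crawl", "Send_Email")]
for name, decision in trace:
    print(name, "->", decision.value)

# Assumed policy variant of Figure 1(a): escalate Delete_Email to the
# user rather than rejecting it immediately.
escalated = decide("Delete_Email", permitted, verifiable={"Delete_Email"})
print("Delete_Email ->", escalated.value)
```

The same function covers Figure 4: a Move_Data call proposed by a misbehaving LLM is rejected for the same reason as an injected call, because the check looks only at the permissions and not at where the call came from.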