Fortify Your Foundations: Practical Privacy and Security for Foundation Model Deployments In The Cloud

Marcin Chrapek, Anjo Vahldiek-Oberwagner, Marcin Spoczynski, Scott Constable, Mona Vij, Torsten Hoefler

TL;DR

This work examines the FM threat model and discusses the practicality and comprehensiveness of approaches for defending against these threats, such as ML-based methods and trusted execution environments (TEEs). It demonstrates that TEEs offer an effective balance between strong security properties, usability, and performance.

Abstract

Foundation Models (FMs) display exceptional performance in tasks such as natural language processing and are being applied across a growing range of disciplines. Although typically trained on large public datasets, FMs are often fine-tuned or integrated into Retrieval-Augmented Generation (RAG) systems, which rely on private data. This access, along with their size and costly training, heightens the risk of intellectual property theft. Moreover, multimodal FMs may expose sensitive information. In this work, we examine the FM threat model and discuss the practicality and comprehensiveness of approaches for defending against these threats, such as ML-based methods and trusted execution environments (TEEs). We demonstrate that TEEs offer an effective balance between strong security properties, usability, and performance. Specifically, we present a solution achieving less than 10% overhead versus bare metal for the full Llama2 7B and 13B inference pipelines running inside Intel SGX and Intel TDX. We also share our configuration files and insights from our implementation. To our knowledge, our work is the first to show the practicality of TEEs for securing FMs.

Paper Structure (19 sections, 5 figures)

Figures (5)

  • Figure 1: Examples of the types of threats that our TEE-based approach actively protects against. We also show example performance results for baseline inference of Llama2 7B INT8 in two TEE implementations: a Virtual Machine (VM)-based one and an application-based one.
  • Figure 2: An overview of the threats and adversaries that exist when offloading FM deployments to the cloud, together with representative examples.
  • Figure 3: An overview of a flow for securing FMs that relies on the properties of TEEs. Here, we assume a secure enclave on an untrusted host that is operated by the CSP and runs an OS supporting the enclave's security features. Green lines show communication channels whose confidentiality and integrity are protected (e.g., by TLS or encrypted storage). Black lines are unprotected.
  • Figure 4: The differences between TDX and SGX, with an extract from our Gramine manifest template file containing all the information needed to provide security to a Gramine-based TEE running an FM workload.
  • Figure 5: The TEE generation speed reductions and latency overheads are within 4-10% for TDX and SGX.