Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware

Florian Tramèr, Dan Boneh

TL;DR

The paper tackles the efficiency gap in secure ML outsourcing by splitting DNN inference: nonlinear, control-sensitive components stay inside a trusted enclave, while the heavy linear layers are outsourced to a fast, untrusted co-processor. It introduces Slalom, which uses quantization, Freivalds' algorithm for integrity, and a lightweight input-blinding scheme to achieve verifiable and private inference with substantial throughput gains on canonical models (e.g., VGG16, MobileNet, ResNet). The authors provide formal security arguments, implement an SGX-based DNN library, and demonstrate 6× to 20× throughput improvements for verifiable inference and 4× to 11× for verifiable and private inference, compared to running entirely in the TEE. They also discuss the challenges and directions for verifiable/private training, and show that Slalom scales favorably with model size, paving the way for practical, secure ML in TEEs and co-located accelerators.

Abstract

As Machine Learning (ML) gets applied to security-critical or sensitive domains, there is a growing need for integrity and privacy for outsourced ML computations. A pragmatic solution comes from Trusted Execution Environments (TEEs), which use hardware and software protections to isolate sensitive computations from the untrusted software stack. However, these isolation guarantees come at a price in performance, compared to untrusted alternatives. This paper initiates the study of high performance execution of Deep Neural Networks (DNNs) in TEEs by efficiently partitioning DNN computations between trusted and untrusted devices. Building upon an efficient outsourcing scheme for matrix multiplication, we propose Slalom, a framework that securely delegates execution of all linear layers in a DNN from a TEE (e.g., Intel SGX or Sanctum) to a faster, yet untrusted, co-located processor. We evaluate Slalom by running DNNs in an Intel SGX enclave, which selectively delegates work to an untrusted GPU. For canonical DNNs (VGG16, MobileNet and ResNet variants) we obtain 6x to 20x increases in throughput for verifiable inference, and 4x to 11x for verifiable and private inference.

Paper Structure

This paper contains 37 sections, 4 theorems, 7 figures, 3 tables.

Key Result

Lemma 2.1

Let $A, B$ and $C$ be $n\times n$ matrices over a field $\mathbb{F}$ and let $s$ be a uniformly random vector in $\mathbb{S}^n$, for $\mathbb{S}\subseteq \mathbb{F}$. Then, $\Pr[C s = A (B s) \mid C \neq A B] = \Pr[(C-AB) s = \mathbf{0} \mid (C-AB) \neq \mathbf{0}] \leq 1/\lvert\mathbb{S}\rvert \,.$
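The check in Lemma 2.1 can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes $\mathbb{F} = \mathbb{S} = \mathbb{Z}_p$ for an illustrative prime `P`, and the helper names (`matvec`, `matmul`, `freivalds`) are hypothetical. Each repetition costs three matrix-vector products instead of a full matrix product, and $k$ independent repetitions bound the false-accept probability by $(1/\lvert\mathbb{S}\rvert)^k$.

```python
import random

# Illustrative prime modulus (an assumption, not taken from the paper).
P = 2**31 - 1


def matvec(M, v, p=P):
    """Matrix-vector product over Z_p."""
    return [sum(a * b for a, b in zip(row, v)) % p for row in M]


def matmul(A, B, p=P):
    """Reference matrix product over Z_p, used here only to build test data."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) % p for col in cols]
            for row in A]


def freivalds(A, B, C, k=1, p=P, rng=random):
    """Probabilistically check C == A @ B over Z_p with k repetitions.

    A correct C is always accepted. An incorrect C is accepted with
    probability at most (1/p)**k, since s is drawn uniformly from Z_p^n.
    A real deployment would draw s from a cryptographically secure source.
    """
    n = len(C[0])
    for _ in range(k):
        s = [rng.randrange(p) for _ in range(n)]
        lhs = matvec(C, s, p)                 # C s
        rhs = matvec(A, matvec(B, s, p), p)   # A (B s)
        if lhs != rhs:
            return False
    return True
```

A verifier would call `freivalds(A, B, C, k)` on the product `C` returned by the untrusted party and reject the result if the call returns `False`.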

Figures (7)

  • Figure 1: The Slalom algorithms for verifiable and private DNN inference. The TEE outsources computation of $n$ linear layers of a model $F$ to the untrusted host server $\mathcal{S}$. Each linear layer is defined by a matrix $W_i$ of size $m_i \times n_i$ and followed by an activation $\sigma$. All operations are over a field $\mathbb{F}$. The $\mathsf{Freivalds}(y_i, x_i, w_i)$ subroutine performs $k$ repetitions of Freivalds' check (possibly using precomputed values as in Section \ref{ssec:verif-linear}). The pseudorandom elements $r_i$ (we omit the PRNG for simplicity) and precomputed values $u_i$ are used only once.
  • Figure 2: Micro benchmarks on Intel SGX. We plot the relative speedup of verifying the result of a linear operator compared to computing it entirely in the enclave. The dotted line shows the throughput obtained for a direct computation. "Fused" separable convolutions contain no intermediate activation.
  • Figure 3: Verifiable and private inference with Intel SGX. We show results for VGG16, VGG16 without the fully connected layers, MobileNet, and a fused MobileNet variant with no intermediate activation for separable convolutions. We compare the baseline of fully executing the DNN in the enclave (blue) to different secure outsourcing schemes: integrity with Freivalds (red); integrity with Freivalds and precomputed secrets (yellow); privacy only (black); privacy and integrity (purple).
  • Figure 4: Secure outsourcing of ResNet models with Intel SGX. We compare the baseline of fully executing the DNN in the enclave (blue) to secure outsourcing with integrity (yellow) and privacy and integrity (purple).
  • Figure 5: Micro benchmarks on an untrusted CPU. For three different linear operators, we plot the relative speedup of verifying a result compared to computing it. The dotted line in each plot shows the throughput obtained for computing the operation.
  • ...and 2 more figures
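
The input blinding mentioned in the Figure 1 caption can be illustrated for a single linear layer. This is a rough sketch under stated assumptions rather than the paper's code: it assumes arithmetic over $\mathbb{Z}_p$ for an illustrative prime `P`, a row-vector convention $y = x W$ with $W$ of size $m \times n$, and a precomputed unblinding value $u = r W$. All function names are hypothetical. Because the layer is linear, $(x + r) W - r W = x W$, so the enclave recovers the true output while the untrusted device only sees $x + r$.

```python
import random

# Illustrative prime modulus (an assumption, not taken from the paper).
P = 2**31 - 1


def vecmat(x, W, p=P):
    """Row vector x (length m) times matrix W (m x n) over Z_p."""
    return [sum(xi * wi for xi, wi in zip(x, col)) % p for col in zip(*W)]


def precompute(W, rng, p=P):
    """Offline, inside the enclave: draw a blinding vector r and u = r W.

    Each (r, u) pair must be used for one input only.
    A real deployment would derive r from a cryptographic PRNG.
    """
    r = [rng.randrange(p) for _ in range(len(W))]
    return r, vecmat(r, W, p)


def blind(x, r, p=P):
    """Inside the enclave: hide x by adding r element-wise."""
    return [(xi + ri) % p for xi, ri in zip(x, r)]


def untrusted_linear(x_blinded, W, p=P):
    """On the untrusted device: apply the linear layer to the blinded input."""
    return vecmat(x_blinded, W, p)


def unblind(y_blinded, u, p=P):
    """Inside the enclave: subtract u = r W to recover x W."""
    return [(yi - ui) % p for yi, ui in zip(y_blinded, u)]
```

For an input `x`, the enclave would compute `unblind(untrusted_linear(blind(x, r), W), u)` and then apply the activation itself. Combining this with an integrity check on the returned value gives the "privacy and integrity" setting compared in the figures.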

Theorems & Definitions (5)

  • Lemma 2.1: Freivalds
  • Lemma 3.1
  • Theorem 3.2
  • Corollary 3.3
  • Definition B.1: Secure Outsourcing Schemes