A Survey on Private Transformer Inference

Yang Li, Xinyu Zhou, Yitong Wang, Liangxin Qian, Jun Zhao

TL;DR

This survey addresses the privacy challenges of transformer inference in MLaaS by compiling recent advances in Private Transformer Inference (PTI) that rely on Secure Multi-Party Computation (MPC) and Homomorphic Encryption (HE). It systematically analyzes PTI studies (2022–2024) across cryptographic setups (2PC, 2PC-Dealer, 3PC), identifying bottlenecks in large MatMul operations and nonlinear layers such as Softmax and GeLU. The authors categorize methods by their treatment of linear and nonlinear layers, compare SSC-based and HE-based MatMul protocols, and evaluate trade-offs in communication, computation, and privacy guarantees, offering evaluation guidelines. The paper provides a critical synthesis of strengths and weaknesses, highlights practical deployment considerations (e.g., trusted dealers, honest-majority assumptions), and proposes concrete avenues for improving efficiency and privacy in PTI. Overall, the work serves as a comprehensive reference for researchers and practitioners aiming to bridge high-performance transformer inference with rigorous data privacy in a practical MLaaS context.

Abstract

Transformer models have revolutionized AI, enabling applications like content generation and sentiment analysis. However, their use in Machine Learning as a Service (MLaaS) raises significant privacy concerns, as centralized servers process sensitive user data. Private Transformer Inference (PTI) addresses these issues using cryptographic techniques such as Secure Multi-Party Computation (MPC) and Homomorphic Encryption (HE), enabling secure model inference without exposing inputs or models. This paper reviews recent advancements in PTI, analyzing state-of-the-art solutions, their challenges, and potential improvements. We also propose evaluation guidelines to assess resource efficiency and privacy guarantees, aiming to bridge the gap between high-performance inference and data privacy.
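To make the MPC side of PTI more concrete, the following is a minimal sketch of a two-party matrix multiplication on additively secret-shared inputs, with a trusted dealer supplying correlated randomness (the 2PC-Dealer setting named in the TL;DR). It assumes additive sharing over the ring Z_{2^32} and Beaver-style matrix triples, a standard construction chosen here for illustration; it is not taken from any specific protocol in the survey, and all function names are hypothetical. Both parties are simulated in one process, so no networking or fixed-point encoding is shown.

```python
import secrets

MOD = 2 ** 32  # ring Z_{2^32}; an illustrative choice, not prescribed by the survey


def rand_matrix(rows, cols):
    """Uniformly random matrix over Z_MOD."""
    return [[secrets.randbelow(MOD) for _ in range(cols)] for _ in range(rows)]


def add(a, b):
    return [[(x + y) % MOD for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[(x - y) % MOD for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) % MOD for col in cols_b]
            for row in a]


def share(m):
    """Split m into two additive shares with m = s0 + s1 (mod MOD)."""
    s0 = rand_matrix(len(m), len(m[0]))
    return s0, sub(m, s0)


def reconstruct(s0, s1):
    return add(s0, s1)


def dealer_triple(n, k, m):
    """Hypothetical trusted dealer: shares of random A, B and of C = A @ B."""
    a, b = rand_matrix(n, k), rand_matrix(k, m)
    return share(a), share(b), share(matmul(a, b))


def secure_matmul(x_sh, y_sh):
    """Return shares of X @ Y given shares of X (n x k) and Y (k x m)."""
    n, k, m = len(x_sh[0]), len(x_sh[0][0]), len(y_sh[0][0])
    a_sh, b_sh, c_sh = dealer_triple(n, k, m)
    # The parties open E = X - A and F = Y - B. A and B are uniform masks,
    # so the opened values reveal nothing about X or Y.
    e = reconstruct(sub(x_sh[0], a_sh[0]), sub(x_sh[1], a_sh[1]))
    f = reconstruct(sub(y_sh[0], b_sh[0]), sub(y_sh[1], b_sh[1]))
    z = []
    for i in (0, 1):
        # Local step: Z_i = E @ B_i + A_i @ F + C_i
        zi = add(add(matmul(e, b_sh[i]), matmul(a_sh[i], f)), c_sh[i])
        if i == 0:
            # The public term E @ F is added by one party only.
            zi = add(zi, matmul(e, f))
        z.append(zi)
    return z[0], z[1]


x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]
z0, z1 = secure_matmul(share(x), share(y))
result = reconstruct(z0, z1)
```

Correctness follows from X @ Y = (E + A)(F + B) = EF + EB + AF + AB, where AB = C is the dealer's precomputed product. In this sketch the only online communication is opening E and F, which grow with the matrix dimensions; that is consistent with the TL;DR's point that large MatMul operations are a bottleneck.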

Paper Structure

This paper contains 25 sections, 26 equations, 1 figure, 13 tables.

Figures (1)

  • Figure 1: Structure and workflow of a Transformer [zhang2024secure].

Theorems & Definitions (1)

  • Definition 1