Machine Learning with Confidential Computing: A Systematization of Knowledge

Fan Mo, Zahra Tarkhani, Hamed Haddadi

TL;DR

This paper addresses the privacy and security challenges in machine learning by systematizing Confidential Computing approaches that rely on Trusted Execution Environments (TEEs). It analyzes threat models, hardware/software components, and partitioning frameworks, and surveys current ML confidentiality/integrity solutions, highlighting bottlenecks such as memory limits, accelerator support, and cross-TEE heterogeneity. The authors propose next-generation directions including formal privacy foundations, layer- or feature-map partitioning across heterogeneous TEEs, dedicated TEE designs for ML, and TEE-aware ML frameworks to reduce risk and overhead. Overall, the work bridges ML and Confidential Computing, offering a roadmap for stronger privacy guarantees in ML pipelines without prohibitive costs and guiding future hardware, OS, and framework development.

Abstract

Privacy and security challenges in Machine Learning (ML) have become increasingly severe, along with ML's pervasive development and the recent demonstration of large attack surfaces. As a mature system-oriented approach, Confidential Computing has been utilized in both academia and industry to mitigate privacy and security issues in various ML scenarios. In this paper, the conjunction between ML and Confidential Computing is investigated. We systematize the prior work on Confidential Computing-assisted ML techniques that provide i) confidentiality guarantees and ii) integrity assurances, and discuss their advanced features and drawbacks. Key challenges are further identified, and we provide dedicated analyses of the limitations in existing Trusted Execution Environment (TEE) systems for ML use cases. Finally, prospective works are discussed, including grounded privacy definitions for closed-loop protection, partitioned executions of efficient ML, dedicated TEE-assisted designs for ML, TEE-aware ML, and ML full pipeline guarantees. By providing these potential solutions in our systematization of knowledge, we aim to build the bridge to help achieve a much stronger TEE-enabled ML for privacy guarantees without introducing computation and system costs.

Paper Structure

This paper contains 29 sections, 6 figures, and 1 table.

Figures (6)

  • Figure 1: Schematic diagram of the machine learning pipeline and the attack surfaces that reside across it. 'I' denotes integrity-related attacks and 'P' denotes privacy-related attacks. 'W' indicates that an attack is generally white-box, while 'B' indicates that it is generally black-box.
  • Figure 2: Schematic diagram of Confidential Computing using a Trusted Execution Environment (TEE).
  • Figure 3: Overview of the Confidential Computing flow, which uses the untrusted host's Trusted Execution Environment to protect the model and data in machine learning.
  • Figure 4: Server-side ML (left) and client-side ML (right) protection using Trusted Execution Environments. Note that, in relation to the ML paradigms defined in the paper's section on ML paradigms, most centralized ML is server-side and most distributed ML is client-side, depending on how the server and the client are defined in a specific ML scenario.
  • Figure 5: Partitioning of the ML process (shown with model architectures) to protect all or part of it inside TEEs.
  • ...and 1 more figure