Machine Learning with Confidential Computing: A Systematization of Knowledge
Fan Mo, Zahra Tarkhani, Hamed Haddadi
TL;DR
This paper addresses the privacy and security challenges in machine learning by systematizing Confidential Computing approaches that rely on Trusted Execution Environments (TEEs). It analyzes threat models, hardware/software components, and partitioning frameworks, and surveys current ML confidentiality/integrity solutions, highlighting bottlenecks such as memory limits, accelerator support, and cross-TEE heterogeneity. The authors propose next-generation directions including formal privacy foundations, layer- or feature-map partitioning across heterogeneous TEEs, dedicated TEE designs for ML, and TEE-aware ML frameworks to reduce risk and overhead. Overall, the work bridges ML and Confidential Computing, offering a roadmap for stronger privacy guarantees in ML pipelines without prohibitive costs and guiding future hardware, OS, and framework development.
Abstract
Privacy and security challenges in Machine Learning (ML) have become increasingly severe, along with ML's pervasive development and the recent demonstration of large attack surfaces. As a mature system-oriented approach, Confidential Computing has been utilized in both academia and industry to mitigate privacy and security issues in various ML scenarios. In this paper, we investigate the conjunction between ML and Confidential Computing. We systematize the prior work on Confidential Computing-assisted ML techniques that provide i) confidentiality guarantees and ii) integrity assurances, and discuss their advanced features and drawbacks. We further identify key challenges and provide dedicated analyses of the limitations of existing Trusted Execution Environment (TEE) systems for ML use cases. Finally, we discuss prospective work, including grounded privacy definitions for closed-loop protection, partitioned execution of efficient ML, dedicated TEE-assisted designs for ML, TEE-aware ML, and ML full-pipeline guarantees. By providing these potential solutions in our systematization of knowledge, we aim to build a bridge toward much stronger TEE-enabled ML with privacy guarantees, without introducing prohibitive computation and system costs.
