An Early Experience with Confidential Computing Architecture for On-Device Model Protection
Sina Abdollahi, Mohammad Maheri, Sandra Siby, Marios Kogias, Hamed Haddadi
TL;DR
The paper investigates Arm Confidential Computing Architecture (CCA) as a platform for protecting on-device ML models by running them inside realm VMs. It defines a deployment framework, analyzes the sources of overhead, and demonstrates privacy benefits with a membership inference attack, reporting at most 22% inference overhead and an 8.3% reduction in the adversary's success rate. The evaluation runs on hardware emulation (FVP) with attestation-enabled realm execution, and the results support the viability of confidential on-device inference; the code is released to encourage early adoption. Limitations include the reliance on emulation rather than real hardware, and the need for hardware support before accelerators can be integrated extensively and full end-to-end privacy guarantees can be given.
Abstract
Deploying machine learning (ML) models on user devices can improve privacy (by keeping data local) and reduce inference latency. Trusted Execution Environments (TEEs) are a practical solution for protecting proprietary models, yet existing TEE solutions have architectural constraints that hinder on-device model deployment. Arm Confidential Computing Architecture (CCA), a new Arm extension, addresses several of these limitations and shows promise as a secure platform for on-device ML. In this paper, we evaluate the performance-privacy trade-offs of deploying models within CCA, highlighting its potential to enable confidential and efficient ML applications. Our evaluations show that CCA incurs an overhead of at most 22% when running models of different sizes and for different applications, including image classification, voice recognition, and chat assistants. This performance overhead comes with privacy benefits; for example, our framework reduces the adversary's success rate in a membership inference attack by 8.3%. To support further research and early adoption, we make our code and methodology publicly available.
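The overhead figure above is a relative slowdown, and the following minimal sketch shows how such a number can be computed from two latency measurements. It assumes overhead is defined as the extra time of realm execution divided by the baseline time; the function name `relative_overhead` and the timings are hypothetical illustrations, not measurements or code from the paper.

```python
def relative_overhead(baseline_s: float, realm_s: float) -> float:
    """Return the fractional slowdown of realm execution over the baseline.

    Assumed definition: (realm time - baseline time) / baseline time.
    """
    if baseline_s <= 0:
        raise ValueError("baseline_s must be positive")
    return (realm_s - baseline_s) / baseline_s


# Hypothetical inference latencies in seconds (not taken from the paper).
baseline_latency = 2.0  # model running in a normal-world VM
realm_latency = 2.3     # same model running inside a realm VM

print(f"overhead: {relative_overhead(baseline_latency, realm_latency):.1%}")
```

Under this definition, a reported bound of 22% would mean that realm inference takes at most 1.22 times as long as the baseline.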
