GuaranTEE: Towards Attestable and Private ML with CCA
Sandra Siby, Sina Abdollahi, Mohammad Maheri, Marios Kogias, Hamed Haddadi
TL;DR
The paper addresses the challenge of private and auditable ML deployment on edge devices by introducing GuaranTEE, a framework that runs model providers' models inside Arm's Confidential Computing Architecture (CCA) realms. It details a threat model and a multi-step pipeline for provisioning, attesting, and running models within a realm, and implements a prototype on Arm's Fixed Virtual Platforms to assess feasibility. Preliminary results show a roughly 1.7x instruction overhead for realm-based inference and substantial setup costs driven by realm creation and memory provisioning; attestation is limited by current hardware simulators. The work highlights practical constraints and outlines the architectural and ecosystem improvements needed to realize fully private, attestable edge ML deployment, including enhanced attestation, secure I/O for model inputs and outputs, per-realm policy enforcement, and better availability guarantees.
Abstract
Machine-learning (ML) models are increasingly being deployed on edge devices to provide a variety of services. However, their deployment is accompanied by challenges in model privacy and auditability. Model providers want to (i) ensure that their proprietary models are not exposed to third parties; and (ii) obtain attestations that their genuine models are operating on edge devices in accordance with the service agreement with the user. Existing measures to address these challenges have been hindered by issues such as high overheads and limited capability (processing/secure memory) on edge devices. In this work, we propose GuaranTEE, a framework to provide attestable private machine learning on the edge. GuaranTEE uses Confidential Computing Architecture (CCA), Arm's latest architectural extension that allows for the creation and deployment of dynamic Trusted Execution Environments (TEEs) within which models can be executed. We assess the feasibility of using CCA to deploy ML models by developing, evaluating, and openly releasing a prototype. We also suggest improvements to CCA to facilitate its use in protecting the entire ML deployment pipeline on edge devices.
