CoreGuard: Safeguarding Foundational Capabilities of LLMs Against Model Stealing in Edge Deployment
Qinfeng Li, Tianyue Luo, Xuhong Zhang, Yangfan Xie, Zhiqiang Shen, Lijun Zhang, Yier Jin, Hao Peng, Xinkui Zhao, Xianwei Zhu, Jianwei Yin
TL;DR
CoreGuard addresses the problem of protecting the foundational capabilities of edge-deployed LLMs against model stealing and misuse. It introduces a two-phase framework: model locking, which applies row permutations to input-processing layers, and an inference authorization protocol, which uses one-time pad (OTP) encryption and a propagation mechanism to limit interaction with the trusted execution environment (TEE) to a single initial authorization. The approach blocks unauthorized usage and nearly matches upper-bound defenses in security, while incurring negligible computation and communication overhead and preserving accuracy across multiple tasks and models. In practice, CoreGuard enables secure on-device LLMs for privacy-sensitive and latency-constrained environments, with broad architectural compatibility and minimal performance impact. The work also discusses its limitations and how it can be combined with broader defense strategies, such as side-channel mitigations for TEEs.
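The locking and propagation ideas can be illustrated with a toy two-layer network. The following is a minimal sketch under stated assumptions, not the authors' implementation: all function names (`lock_model`, `authorize`, `forward`) and the tiny integer weights are hypothetical, the TEE is modeled as a plain function that holds the secret permutation, and the OTP encryption step is not modeled at all. The weights of the first layer have their rows permuted (the lock) and their columns permuted (so the permutation propagates to the next layer), which means a single permutation of the input is enough to recover the original output.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    """Element-wise ReLU; it commutes with any permutation of the columns."""
    return [[max(0, v) for v in row] for row in m]

def permute_rows(m, perm):
    """New row i is old row perm[i]."""
    return [m[p] for p in perm]

def permute_cols(m, perm):
    """New column j is old column perm[j]."""
    return [[row[p] for p in perm] for row in m]

def forward(w1, w2, x):
    """Toy two-layer network: relu(x @ w1) @ w2."""
    return matmul(relu(matmul(x, w1)), w2)

def lock_model(w1, w2, perm):
    """Hypothetical locking step (assumption, for illustration only).

    Row-permuting w1 means only a correspondingly permuted input is processed
    correctly. Column-permuting w1 makes its output come out permuted, which
    matches the row-permuted w2, so no second authorization is needed.
    """
    w1_locked = permute_cols(permute_rows(w1, perm), perm)
    w2_locked = permute_rows(w2, perm)
    return w1_locked, w2_locked

def authorize(x, perm):
    """Stand-in for the single TEE call: permute the input features."""
    return permute_cols(x, perm)

# Toy data (hypothetical values chosen so the result is easy to check by hand).
PERM = [2, 0, 1]                      # secret permutation, known only to the TEE
W1 = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
W2 = [[1, 0], [0, 1], [1, 1]]
X = [[1, -2, 3]]

W1_LOCKED, W2_LOCKED = lock_model(W1, W2, PERM)

reference = forward(W1, W2, X)                                      # original model
authorized = forward(W1_LOCKED, W2_LOCKED, authorize(X, PERM))      # locked + TEE
unauthorized = forward(W1_LOCKED, W2_LOCKED, X)                     # locked, no TEE

print(reference, authorized, unauthorized)
```

In this sketch the authorized path reproduces the reference output exactly, while feeding the locked weights an unpermuted input gives a different result. A real deployment would additionally need to hide the permutation from an attacker who observes inputs and outputs of the authorization step, which is the role the summary attributes to OTP encryption.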
Abstract
Proprietary large language models (LLMs) exhibit strong generalization capabilities across diverse tasks and are increasingly deployed on edge devices for efficiency and privacy reasons. However, deploying proprietary LLMs at the edge without adequate protection introduces critical security threats. Attackers can extract model weights and architectures, enabling unauthorized copying and misuse. Even when protective measures prevent full extraction of model weights, attackers may still mount advanced attacks, such as fine-tuning, to further exploit the model. Existing defenses against these threats typically incur significant computational and communication overhead, making them impractical for edge deployment. To safeguard edge-deployed LLMs, we introduce CoreGuard, a computation- and communication-efficient protection method. CoreGuard employs an efficient protection protocol to reduce computational overhead and a propagation protocol to minimize communication overhead. Extensive experiments show that CoreGuard achieves upper-bound security protection with negligible overhead.
