CoreGuard: Safeguarding Foundational Capabilities of LLMs Against Model Stealing in Edge Deployment

Qinfeng Li, Tianyue Luo, Xuhong Zhang, Yangfan Xie, Zhiqiang Shen, Lijun Zhang, Yier Jin, Hao Peng, Xinkui Zhao, Xianwei Zhu, Jianwei Yin

TL;DR

CoreGuard protects the foundational capabilities of edge-deployed LLMs against model stealing and misuse. It is a two-phase framework: (1) model locking, which permutes the rows of input-processing layers before deployment, and (2) inference authorization, a protocol that uses one-time pad (OTP) encryption and a propagation mechanism so that the trusted execution environment (TEE) is involved only once, for the initial authorization. The approach blocks unauthorized usage and nearly matches upper-bound defenses, while incurring negligible computation and communication overhead and preserving accuracy across multiple tasks and models. In practice, CoreGuard enables secure on-device LLMs for privacy-sensitive and latency-constrained environments, with broad architectural compatibility. The work also discusses its limitations and how it can be combined with broader defense strategies, such as side-channel mitigations for TEEs.
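
The locking idea can be illustrated with a minimal pure-Python sketch. It assumes a single linear layer computed as `x @ W`, with the rows of `W` indexed by input feature. The helper names (`permute_rows`, `permute_features`, `close`), the fixed permutation, and the toy dimensions are illustrative assumptions rather than details from the paper. The sketch deliberately omits the TEE, the OTP encryption, and the propagation mechanism. It only shows why a row-permuted layer computes the original output when its input features are permuted the same way, and a different output otherwise.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def permute_rows(w, perm):
    """Lock a weight matrix: new row i is old row perm[i]."""
    return [w[p] for p in perm]


def permute_features(x, perm):
    """Authorize features: new column i is old column perm[i]."""
    return [[row[p] for p in perm] for row in x]


def close(a, b, tol=1e-9):
    """Element-wise comparison with a small floating-point tolerance."""
    return all(abs(u - v) <= tol
               for ra, rb in zip(a, b)
               for u, v in zip(ra, rb))


rng = random.Random(0)
d_in, d_out = 6, 4

# Toy stand-ins for a layer's weights and a batch of two input features.
W = [[rng.uniform(-1, 1) for _ in range(d_out)] for _ in range(d_in)]
x = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(2)]

# Hypothetical secret permutation (fixed here for reproducibility).
perm = [3, 0, 5, 1, 4, 2]

W_locked = permute_rows(W, perm)          # what would be shipped to the device
reference = matmul(x, W)                  # output of the original layer
authorized = matmul(permute_features(x, perm), W_locked)
unauthorized = matmul(x, W_locked)        # input was never permuted

print("authorized matches original:  ", close(authorized, reference))
print("unauthorized matches original:", close(unauthorized, reference))
```

The authorized path recovers the original output because summing over permuted feature and row pairs visits the same products in a different order. Feeding unpermuted features to the locked layer pairs each feature with the wrong row, so the output no longer matches.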

Abstract

Proprietary large language models (LLMs) exhibit strong generalization capabilities across diverse tasks and are increasingly deployed on edge devices for efficiency and privacy reasons. However, deploying proprietary LLMs at the edge without adequate protection introduces critical security threats. Attackers can extract model weights and architectures, enabling unauthorized copying and misuse. Even when protective measures prevent full extraction of model weights, attackers may still perform advanced attacks, such as fine-tuning, to further exploit the model. Existing defenses against these threats typically incur significant computational and communication overhead, making them impractical for edge deployment. To safeguard edge-deployed LLMs, we introduce CoreGuard, a computation- and communication-efficient protection method. CoreGuard employs an efficient protection protocol to reduce computational overhead and a propagation protocol to minimize communication overhead. Extensive experiments show that CoreGuard achieves upper-bound security protection with negligible overhead.

Paper Structure

This paper contains 18 sections, 18 equations, 3 figures, and 6 tables.

Figures (3)

  • Figure 1: An overview of CoreGuard. (a) Model locking: before deployment, CoreGuard permutes layers in the original model, creating a locked model. (b) Inference authorization: during inference, the input features of the permuted layers are authorized; this authorization is integrated into the FFN block of the preceding transformer layer.
  • Figure 2: CoreGuard's Defense Effectiveness Against Model Stealing Across Various Attack Settings.
  • Figure 3: Impact of authorization position on security. Model-stealing accuracy is reported for different positions, with the total number of transformer layers indicated for each model.