SVIP: Towards Verifiable Inference of Open-source Large Language Models

Yifan Sun; Yuhang Li; Yue Zhang; Yuchen Jin; Huan Zhang

TL;DR

SVIP tackles the problem of verifiable inference for open-source LLMs in decentralized settings by requiring the computing provider to return processed hidden representations along with a proxy-task signal. A simple hidden-state protocol uses a proxy task trained exclusively on the specified model's representations to verify model usage, and a secret-based extension introduces a controllable secret to prevent direct vector optimization attacks. The approach achieves low per-query false negative and false positive rates (below 5% and 3%, respectively) with aggregation over multiple prompts, and remains computationally lightweight (verification under 0.01 seconds per prompt). Extensive experiments across 5 specified LLMs and 6 alternatives demonstrate robustness to adaptive adversaries, with a secret-update mechanism further hardening security and supporting up to 80–120 million prompts between retraining. The work offers a practical, scalable framework to foster trust in open-source LLM inference for users and platforms alike, bridging a gap between security guarantees and real-time deployment needs.
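To make the effect of aggregating over multiple prompts concrete, the sketch below computes how often a majority vote over several per-query checks would be wrong. The majority-vote rule, the independence of per-query errors, and the function name `majority_error` are assumptions made for illustration; the summary above does not state which aggregation rule SVIP uses.

```python
from math import comb


def majority_error(per_query_error: float, num_prompts: int) -> float:
    """Probability that more than half of the per-query decisions are wrong.

    Assumes each query errs independently with the same probability,
    which is a simplification and not a claim made by the paper.
    """
    needed = num_prompts // 2 + 1  # smallest number of errors that forms a majority
    return sum(
        comb(num_prompts, k)
        * per_query_error ** k
        * (1 - per_query_error) ** (num_prompts - k)
        for k in range(needed, num_prompts + 1)
    )


# Illustrative inputs: a 5% per-query error rate, aggregated over 1 and 5 prompts.
single = majority_error(0.05, 1)
five = majority_error(0.05, 5)
```

Under these assumptions the aggregated error falls quickly as more prompts are combined, which is the intuition behind checking several prompts before accusing a provider of substitution.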

Abstract

The ever-increasing size of open-source Large Language Models (LLMs) renders local deployment impractical for individual users. Decentralized computing has emerged as a cost-effective solution, allowing individuals and small companies to perform LLM inference for users using surplus computational power. However, a computing provider may stealthily substitute the requested LLM with a smaller, less capable model without consent from users, thereby benefiting from cost savings. We introduce SVIP, a secret-based verifiable LLM inference protocol. Unlike existing solutions based on cryptographic or game-theoretic techniques, our method is computationally effective and does not rest on strong assumptions. Our protocol requires the computing provider to return both the generated text and processed hidden representations from LLMs. We then train a proxy task on these representations, effectively transforming them into a unique model identifier. With our protocol, users can reliably verify whether the computing provider is acting honestly. A carefully integrated secret mechanism further strengthens its security. We thoroughly analyze our protocol under multiple strong and adaptive adversarial scenarios. Our extensive experiments demonstrate that SVIP is accurate, generalizable, computationally efficient, and resistant to various attacks. Notably, SVIP achieves false negative rates below 5% and false positive rates below 3%, while requiring less than 0.01 seconds per prompt query for verification.
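The user-side check described in the abstract can be sketched roughly as follows: the provider returns a processed hidden representation, a proxy-task head maps it to a prediction, and the user accepts only if that prediction is close to the label expected for the prompt. This is a minimal sketch under stated assumptions: `verify_response`, `toy_label`, `toy_proxy_head`, the L2 distance, and the threshold value are all hypothetical stand-ins. In SVIP the proxy task is trained on the specified model's representations, and the secret mechanism is not modeled here.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify_response(
    prompt: str,
    returned_repr: Vector,
    proxy_head: Callable[[Vector], Vector],
    expected_label: Callable[[str], Vector],
    threshold: float,
) -> bool:
    """Accept when the proxy-task prediction matches the label for this prompt."""
    predicted = proxy_head(returned_repr)
    return l2_distance(predicted, expected_label(prompt)) <= threshold


# Toy stand-ins (hypothetical): a real deployment would use trained networks.
def toy_label(prompt: str) -> List[float]:
    # Label derived from the prompt alone: character count and space count.
    return [float(len(prompt)), float(prompt.count(" "))]


def toy_proxy_head(rep: Vector) -> List[float]:
    # Fixed map standing in for a proxy head fitted to the specified model.
    return [rep[0] + rep[1], rep[2]]


prompt = "verify this model"
honest_repr = [10.0, 7.0, 2.0]      # hand-picked so the head reproduces the label
substituted_repr = [3.0, 1.0, 9.0]  # what a different model might return

honest_ok = verify_response(prompt, honest_repr, toy_proxy_head, toy_label, 0.5)
substituted_ok = verify_response(prompt, substituted_repr, toy_proxy_head, toy_label, 0.5)
```

The check itself is a single forward pass through a small head plus a distance comparison, which is consistent with the abstract's point that verification stays lightweight on the user side.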
