Invisible Tokens, Visible Bills: The Urgent Need to Audit Hidden Operations in Opaque LLM Services
Guoheng Sun, Ziyao Wang, Xuandong Zhao, Bowei Tian, Zheyu Shen, Yexiao He, Jinming Xing, Ang Li
TL;DR
This position paper addresses the transparency gap in Commercial Opaque LLM Services (COLS), where users are billed for hidden operations they cannot observe. The authors formalize actual quantities $T_Q$, $C_Q$ and unit qualities $T_q$, $C_q$, and analyze two failure modes: quantity inflation and quality downgrade. They propose a three-layer auditing framework and a suite of auditing methods (commitment-based, predictive, behavioral, and signature-based), optionally complemented by watermarking and trusted execution environments (TEEs). The framework aims to provide verifiable, privacy-preserving auditability that balances provider confidentiality with user accountability, and to guide policy and governance in commercial LLM ecosystems.
Abstract
Modern large language model (LLM) services increasingly rely on complex, often abstract operations, such as multi-step reasoning and multi-agent collaboration, to generate high-quality outputs. While users are billed based on token consumption and API usage, these internal steps are typically not visible. We refer to such systems as Commercial Opaque LLM Services (COLS). This position paper highlights emerging accountability challenges in COLS: users are billed for operations they cannot observe, verify, or contest. We formalize two key risks: \textit{quantity inflation}, where token and call counts may be artificially inflated, and \textit{quality downgrade}, where providers might quietly substitute lower-cost models or tools. Addressing these risks requires a diverse set of auditing strategies, including commitment-based, predictive, behavioral, and signature-based methods. We further explore the potential of complementary mechanisms such as watermarking and trusted execution environments to enhance verifiability without compromising provider confidentiality. We also propose a modular three-layer auditing framework for COLS and users that enables trustworthy verification across execution, secure logging, and user-facing auditability without exposing proprietary internals. Our aim is to encourage further research and policy development toward transparency, auditability, and accountability in commercial LLM services.
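To make the idea of commitment-based auditing concrete, the following is a minimal sketch and not the authors' protocol. It assumes a simple salted SHA-256 hash commitment: the provider publishes a digest of its hidden tokens at billing time, and a trusted auditor who later receives the tokens and nonce checks them against both the digest and the billed count. The function names `commit` and `verify`, the length-prefixed encoding, and the trusted-auditor setting are all illustrative assumptions.

```python
import hashlib
import hmac
import secrets


def _encode(tokens: list[str]) -> bytes:
    # Length-prefix each token so that different token lists can never
    # serialize to the same byte string (hypothetical encoding choice).
    out = bytearray()
    for tok in tokens:
        raw = tok.encode("utf-8")
        out += len(raw).to_bytes(4, "big") + raw
    return bytes(out)


def commit(hidden_tokens: list[str]) -> tuple[str, bytes]:
    """Provider side: bind to the hidden tokens without revealing them."""
    nonce = secrets.token_bytes(16)  # random salt, kept secret until audit
    digest = hashlib.sha256(nonce + _encode(hidden_tokens)).hexdigest()
    return digest, nonce


def verify(commitment: str, nonce: bytes,
           revealed_tokens: list[str], billed_count: int) -> bool:
    """Auditor side: check the reveal matches the commitment and the bill."""
    digest = hashlib.sha256(nonce + _encode(revealed_tokens)).hexdigest()
    same_content = hmac.compare_digest(digest, commitment)
    return same_content and len(revealed_tokens) == billed_count


# Usage: an honest bill verifies; an inflated bill does not.
commitment, nonce = commit(["step", "one", "done"])
honest = verify(commitment, nonce, ["step", "one", "done"], billed_count=3)
inflated = verify(commitment, nonce, ["step", "one", "done"], billed_count=5)
```

In this sketch the user only ever sees the digest, so provider internals stay confidential unless an audit is triggered. It detects a mismatch between the committed tokens and the billed count, but it cannot by itself show that the committed tokens were actually generated or needed.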
