NetEcho: From Real-World Streaming Side-Channels to Full LLM Conversation Recovery

Zheng Zhang, Guanlong Wu, Sen Deng, Shuai Wang, Yinqian Zhang

TL;DR

This work addresses the security risk posed by network side channels in real-time LLM streaming interfaces. It introduces NetEcho, a novel framework that combines multi-dimensional traffic traces with LLM-based reasoning to recover complete user–LLM conversations, including prompts and model outputs, from encrypted traffic. Through a systematic analysis of seven deployment scenarios and extensive evaluations in the medical and legal domains, NetEcho achieves high recovery rates (up to $\approx 95\%$ average success, with some cases at $100\%$) and generalizes well across real-world conditions, including imbalanced and out-of-distribution data. The findings reveal that current defenses, both active padding and passive batching, are insufficient in practice, underscoring the need for robust application-layer defenses in streaming protocols to mitigate deep privacy leakage and inform future security tooling.

Abstract

In the rapidly expanding landscape of Large Language Model (LLM) applications, real-time output streaming has become the dominant interaction paradigm. While this enhances user experience, recent research reveals that it exposes a non-trivial attack surface through network side-channels. Adversaries can exploit patterns in encrypted traffic to infer sensitive information and reconstruct private conversations. In response, LLM providers and third-party services are deploying defenses such as traffic padding and obfuscation to mitigate these vulnerabilities. This paper starts by presenting a systematic analysis of contemporary side-channel defenses in mainstream LLM applications, with a focus on services from vendors like OpenAI and DeepSeek. We identify and examine seven representative deployment scenarios, each incorporating active or passive mitigation techniques. Despite these enhanced security measures, our investigation uncovers significant residual information in the network traffic that remains vulnerable to leakage. Building on this discovery, we introduce NetEcho, a novel, LLM-based framework that comprehensively exposes the network side-channel risks of today's LLM applications. NetEcho is designed to recover entire conversations, including both user prompts and LLM responses, directly from encrypted network traffic. It features a deliberate design that ensures high-fidelity text recovery, transferability across different deployment scenarios, and moderate operational cost. In our evaluations on medical and legal applications built upon leading models like DeepSeek-v3 and GPT-4o, NetEcho recovers on average $\sim$70\% of the information in each conversation, demonstrating a critical limitation in current defense mechanisms. We conclude by discussing the implications of our findings and proposing future directions for augmenting network traffic security.

Paper Structure

This paper contains 28 sections, 5 equations, 11 figures, 10 tables.

Figures (11)

  • Figure 1: A medical LLM application that receives user prompts (a questionnaire) and outputs a preliminary diagnosis. Adversaries can monitor the encrypted streaming traffic.
  • Figure 2: Traces $\boldsymbol{\tau}$ generated from the victim conversation $\mathbf{c}$ enable localization of the semantic space, in which LLMs can efficiently search for the optimal recovery. The results serve as the refined reference for the next iteration, which further narrows down the semantic space.
  • Figure 3: DeepSeek traffic characteristics.
  • Figure 4: Distribution of token count per packet.
  • Figure 5: Even when the upstream API adopts a defense, the downstream interface [vercel2025ai] still interacts with users token by token.
  • ...and 6 more figures