From Prompts to Packets: A View from the Network on ChatGPT, Copilot, and Gemini
Antonio Montieri, Alfredo Nascita, Antonio Pescapè
TL;DR
This work addresses the limited understanding of the mobile network traffic generated by Generative AI (GenAI) chatbots. It adopts a dual-dataset design (60 hours of generic prompts and 90 minutes of controlled prompts) and uses MIRAGE-based capture on Android devices to perform fine-grained trace-, flow-, and protocol-level analyses of ChatGPT, Copilot, and Gemini, including Multimodal Markov Chain modeling and an occlusion-based payload study. The study reveals app- and content-specific traffic patterns, the predominance of TLS (with Gemini relying heavily on QUIC and ChatGPT using TLS 1.3 exclusively), and the strong discriminative power of TLS extension bytes, notably the Server Name Indication (SNI), for traffic classification. It also highlights substantial upstream traffic and clear differences from conventional messaging apps. The datasets are publicly released to enable reproducibility and further investigation, and the findings have practical implications for traffic monitoring, network planning, and management in increasingly AI-driven mobile ecosystems. Overall, the paper demonstrates that GenAI chatbot traffic constitutes a distinct mobile workload that needs specialized monitoring and provisioning strategies as adoption grows.
Abstract
Generative AI (GenAI) chatbots are now pervasive in digital ecosystems, yet their network traffic remains largely underexplored. This study presents an in-depth investigation of traffic generated by three leading chatbots (ChatGPT, Copilot, and Gemini) when accessed via Android mobile apps for both text and image generation. Using a dedicated capture architecture, we collect and label two complementary workloads: a 60-hour generic dataset with unconstrained prompts, and a controlled dataset built from identical prompts across GenAI apps and replicated via conventional messaging apps to enable one-to-one comparisons. This dual design allows us to address practical research questions on the distinctiveness of GenAI traffic, its differences from widely deployed traffic categories, and its novel implications for network usage. To this end, we provide fine-grained traffic characterization at trace, flow, and protocol levels, and model packet-sequence dynamics with Multimodal Markov Chains. Our analyses reveal app- and content-specific traffic patterns, particularly in volume, uplink/downlink profiles, and protocol adoption. We highlight the predominance of TLS, with Gemini extensively leveraging QUIC, ChatGPT exclusively using TLS 1.3, and app- and content-specific Server Name Indication (SNI) values. A payload-based occlusion analysis quantifies SNI's contribution to classification: masking it reduces F1-score by up to 20 percentage points in GenAI app traffic classification. Finally, compared with conventional messaging apps when carrying the same content, GenAI chatbots exhibit unique traffic characteristics, highlighting new stress factors for mobile networks, such as sustained upstream activity, with direct implications for network monitoring and management. We publicly release the datasets to support reproducibility and foster extensions to other use cases.
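To make the packet-sequence modeling idea concrete, the following is a minimal sketch of a first-order Markov chain fitted over joint packet states. It is an illustration only, not the authors' implementation: the state definition (direction paired with a payload-size bin), the 500-byte bin width, the helper names `to_state` and `fit_markov_chain`, and the toy flows are all assumptions introduced here.

```python
from collections import Counter, defaultdict

def to_state(direction, payload_len, bin_width=500):
    """Map a packet to a joint state: (direction, payload-size bin).

    The joint state and the bin width are illustrative assumptions.
    """
    return (direction, payload_len // bin_width)

def fit_markov_chain(sequences):
    """Estimate first-order transition probabilities from state sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count every observed transition between consecutive packets.
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    # Normalize each row of counts into a probability distribution.
    return {
        cur: {nxt: n / sum(nxts.values()) for nxt, n in nxts.items()}
        for cur, nxts in counts.items()
    }

# Toy flows (made-up values): each packet is (direction, payload length in bytes).
flows = [
    [("up", 300), ("down", 1400), ("down", 1400), ("up", 80)],
    [("up", 350), ("down", 1200), ("up", 60)],
]
sequences = [[to_state(d, n) for d, n in flow] for flow in flows]
chain = fit_markov_chain(sequences)

for state, transitions in chain.items():
    print(state, "->", transitions)
```

Each row of `chain` is a distribution over the next packet state given the current one, so chains fitted per app or per content type could be compared row by row to spot differences in uplink/downlink dynamics.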
