NetGPT: A Native-AI Network Architecture Beyond Provisioning Personalized Generative Services

Yuxuan Chen, Rongpeng Li, Zhifeng Zhao, Chenghui Peng, Jianjun Wu, Ekram Hossain, Honggang Zhang

TL;DR

NetGPT addresses the provisioning of personalized generative services by orchestrating cloud-edge LLM collaboration within an AI-native network architecture. It demonstrates feasibility by deploying GPT-2-base at the edge and LLaMA-7B in the cloud with LoRA-based fine-tuning, and formalizes edge-to-cloud prompt enhancement as $P_{\text{com}} = \text{LLM}_{\theta}(P_{\text{con}}; \mathcal{I}_{\text{personalized}})$, where $P_{\text{con}}$ is the user's concise prompt, $P_{\text{com}}$ is the comprehensive prompt generated at the edge, and $\theta^{*}$ is defined by a minimization over a dataset. The study shows substantial latency and bandwidth advantages over cloud-only deployments, while keeping edge resource usage modest (e.g., approximately $1.65$ GB VRAM) and enabling location-based personalization via edge-generated comprehensive prompts. Beyond generative services, NetGPT proposes an AI-native network architecture with converged communications and computing (C&C), data/privacy protections, and a logical AI workflow to unify network management tasks such as popularity prediction and intent inference. These contributions suggest a practical path toward AI-integrated network control that leverages edge-local personalization and cloud-scale reasoning, while acknowledging challenges in data privacy, online adaptation, and multi-modal extensions.
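
The two-stage flow described above (an edge LLM expands a concise prompt using personalized information, and the cloud LLM answers the resulting comprehensive prompt) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `synergy_pipeline`, `toy_edge_llm`, and `toy_cloud_llm` are hypothetical names, and the two toy functions are plain string stand-ins for the fine-tuned GPT-2-base and LLaMA-7B models.

```python
from typing import Callable


def synergy_pipeline(
    concise_prompt: str,
    personalized_info: str,
    edge_llm: Callable[[str, str], str],
    cloud_llm: Callable[[str], str],
) -> str:
    """Run P_com = LLM_theta(P_con; I_personalized) at the edge, then query the cloud."""
    # Edge step: complete the concise prompt with location-based information.
    comprehensive_prompt = edge_llm(concise_prompt, personalized_info)
    # Cloud step: the larger model only ever sees the comprehensive prompt.
    return cloud_llm(comprehensive_prompt)


def toy_edge_llm(p_con: str, info: str) -> str:
    # Hypothetical stand-in for the edge model: appends the context verbatim.
    return f"{p_con} (context: {info})"


def toy_cloud_llm(p_com: str) -> str:
    # Hypothetical stand-in for the cloud model: echoes what it received.
    return f"[cloud response to: {p_com}]"


if __name__ == "__main__":
    print(synergy_pipeline("Find a restaurant", "city-A", toy_edge_llm, toy_cloud_llm))
```

Passing the models in as callables keeps the sketch independent of any particular inference library; a real deployment would substitute calls to the edge and cloud model endpoints.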

Abstract

Large language models (LLMs) have achieved tremendous success in empowering daily life with generated information. Personalizing LLMs could further broaden their applications through better alignment with human intents. Towards personalized generative services, a collaborative cloud-edge methodology is promising, as it facilitates the effective orchestration of heterogeneous distributed communication and computing resources. In this article, we put forward NetGPT to synergize appropriate LLMs at the edge and in the cloud according to their computing capacity. In addition, edge LLMs can efficiently leverage location-based information for personalized prompt completion, thus benefiting the interaction with the cloud LLM. In particular, we demonstrate the feasibility of NetGPT by applying low-rank adaptation-based fine-tuning to open-source LLMs (i.e., the GPT-2-base model and the LLaMA model), and conduct comprehensive numerical comparisons with alternative cloud-edge collaboration and cloud-only techniques to demonstrate the superiority of NetGPT. Subsequently, we highlight the essential changes required for an artificial intelligence (AI)-native network architecture towards NetGPT, with emphasis on deeper integration of communications and computing resources and careful calibration of the logical AI workflow. Furthermore, we demonstrate several benefits of NetGPT that come as by-products: the edge LLMs' capability to predict trends and infer intents promises a unified solution for intelligent network management and orchestration. We argue that NetGPT is a promising AI-native network architecture that goes beyond provisioning personalized generative services.

Paper Structure

This paper contains 23 sections and 6 figures.

Figures (6)

  • Figure 1: An illustration of candidate means to realize the cloud-edge collaboration for NetGPT, compared with alternative cloud-edge frameworks. Specifically, transmission latency is calculated for $10,000$ "concise prompts" with an average size of $12$ bytes (correspondingly, $95$-byte "comprehensive prompts") and a transmission rate of $1$ Gbps. For the "LLM Splitting" framework, we take the example of placing $1/4$ of the LLaMA-7B model at the edge, with $D \approx 10,922$ representing the ratio of intermediate layer data volume to input token size.
  • Figure 2: A framework of collaborative cloud-edge computing towards NetGPT.
  • Figure 3: Comparison between "LLM synergy" framework and cloud-only solution. Top-left: Inferring contextual words following "concise prompts". Top-right: Examples of generated "comprehensive prompts" by regional edge LLMs under "LLM synergy" framework, as well as more personalized cloud LLM responses. Bottom-left: Simpler, non-personalized responses from cloud-only solution for the same prompts. Bottom-right: Numerical comparison between "LLM synergy" and cloud-only frameworks.
  • Figure 4: The illustration of an AI-native network architecture and logical AI workflow for NetGPT.
  • Figure 5: Edge LLM for popularity prediction: From data-sample template, fine-tuning to prediction accuracy.
  • ...and 1 more figures
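
The transmission-latency figures quoted in the Figure 1 caption can be reproduced with simple arithmetic. The sketch below is a back-of-the-envelope estimate under stated assumptions, not the authors' calculation: it takes $1$ Gbps as $10^{9}$ bits per second, ignores protocol overhead and propagation delay, and, for "LLM Splitting", assumes the ratio $D$ multiplies the concise-prompt payload directly. The function and constant names are hypothetical.

```python
def transmission_latency_s(num_prompts: int, bytes_per_prompt: float, rate_bps: float) -> float:
    """Time in seconds to send num_prompts payloads of bytes_per_prompt each at rate_bps."""
    total_bits = num_prompts * bytes_per_prompt * 8
    return total_bits / rate_bps


NUM_PROMPTS = 10_000        # from the Figure 1 caption
RATE_BPS = 1e9              # 1 Gbps, assumed to mean 10^9 bits per second
CONCISE_BYTES = 12          # average "concise prompt" size
COMPREHENSIVE_BYTES = 95    # corresponding "comprehensive prompt" size
D_RATIO = 10_922            # intermediate-layer data volume / input token size

concise_s = transmission_latency_s(NUM_PROMPTS, CONCISE_BYTES, RATE_BPS)
comprehensive_s = transmission_latency_s(NUM_PROMPTS, COMPREHENSIVE_BYTES, RATE_BPS)
# Assumption: the split model ships D times the concise-prompt volume to the cloud.
split_s = transmission_latency_s(NUM_PROMPTS, CONCISE_BYTES * D_RATIO, RATE_BPS)

if __name__ == "__main__":
    print(f"concise prompts:       {concise_s * 1e3:.2f} ms")
    print(f"comprehensive prompts: {comprehensive_s * 1e3:.2f} ms")
    print(f"LLM splitting:         {split_s:.2f} s")
```

Under these assumptions the concise prompts take roughly 0.96 ms, the comprehensive prompts roughly 7.6 ms, and the split-model intermediate data roughly 10.5 s, which illustrates why shipping intermediate activations is far more bandwidth-hungry than shipping prompts.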