Retrieval-Augmented Generation for Mobile Edge Computing via Large Language Model

Runtao Ren, Yinyu Wu, Xuhui Zhang, Jinke Ren, Yanyan Shen, Shuqiang Wang, Kim-Fung Tsang

TL;DR

The paper tackles the challenge of real-time, scalable resource allocation in MEC by introducing a retrieval-augmented generation (RAG) framework that combines dense retrieval with large language models (LLMs) to produce adaptive offloading decisions. By storing computing-capability configurations in a vector knowledge base, retrieving context with a Bi-encoder, and generating decisions via LLM prompts, the approach addresses dynamism and interpretability gaps in traditional DL/RL MEC methods. The method is applied to a latency-minimization problem over time slots with coupled variables (offloading ratio, transmit power, and edge-computing allocation) and demonstrates substantial latency reductions across diverse scenarios, with high retrieval quality (MRR and HR) and competitive interpretability. The results indicate that LLM-driven RAG is a promising, scalable direction for MEC decision-making in highly dynamic, multi-user environments, and point to future work on scaling and user-centric improvements using LLMs.
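
The retrieve-then-prompt flow and the two retrieval metrics named above (MRR and HR) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's implementation: the two-dimensional vectors stand in for Bi-encoder embeddings, the knowledge-base entries and their texts are invented, and `retrieve`, `build_prompt`, `mrr`, and `hit_rate` are hypothetical helper names. No LLM is called; the sketch stops at the prompt string.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, kb, k):
    """Return the k knowledge-base entries most similar to the query."""
    ranked = sorted(kb, key=lambda e: cosine(query_vec, e["vec"]), reverse=True)
    return ranked[:k]

def build_prompt(task, entries):
    """Assemble an LLM prompt from the retrieved context (hypothetical wording)."""
    context = "\n".join(f"- {e['text']}" for e in entries)
    return f"Known configurations:\n{context}\n\nTask: {task}\nSuggest an offloading decision."

def mrr(ranked_ids, relevant_ids):
    """Mean reciprocal rank of the single relevant entry per query."""
    total = 0.0
    for ranked, rel in zip(ranked_ids, relevant_ids):
        if rel in ranked:
            total += 1.0 / (ranked.index(rel) + 1)
    return total / len(ranked_ids)

def hit_rate(ranked_ids, relevant_ids, k):
    """Fraction of queries whose relevant entry appears in the top k."""
    hits = sum(1 for ranked, rel in zip(ranked_ids, relevant_ids) if rel in ranked[:k])
    return hits / len(ranked_ids)

# Toy knowledge base: the vectors are hand-made placeholders for embeddings.
kb = [
    {"id": "cfg_a", "vec": [1.0, 0.0], "text": "cfg_a: low-power user, fast edge server"},
    {"id": "cfg_b", "vec": [0.0, 1.0], "text": "cfg_b: high-power user, slow edge server"},
    {"id": "cfg_c", "vec": [1.0, 1.0], "text": "cfg_c: balanced user and server"},
]
queries = [[0.9, 0.1], [0.2, 0.8]]
relevant = ["cfg_a", "cfg_c"]  # assumed ground-truth entry for each query

ranked_ids = [[e["id"] for e in retrieve(q, kb, k=3)] for q in queries]
prompt = build_prompt("offload a 1 Mbit task", retrieve(queries[0], kb, k=2))

print(ranked_ids)
print("MRR:", mrr(ranked_ids, relevant))
print("HR@1:", hit_rate(ranked_ids, relevant, 1), "HR@2:", hit_rate(ranked_ids, relevant, 2))
```

In this toy setup the first query ranks its relevant entry first and the second ranks it second, so MRR is 0.75, HR@1 is 0.5, and HR@2 is 1.0. A real system would replace the hand-made vectors with encoder outputs and the linear scan with a vector index.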

Abstract

The rapid evolution of mobile edge computing (MEC) has introduced significant challenges in optimizing resource allocation in highly dynamic wireless communication systems, in which task offloading decisions should be made in real time. However, existing resource allocation strategies adapt poorly to the dynamic and heterogeneous characteristics of MEC systems because they lack scalability, context-awareness, and interpretability. To address these issues, this paper proposes a novel retrieval-augmented generation (RAG) method to improve the performance of MEC systems. Specifically, a latency minimization problem is first formulated to jointly optimize the data offloading ratio, transmit power allocation, and computing resource allocation. Then, an LLM-enabled information-retrieval mechanism is proposed to solve the problem efficiently. Extensive experiments across multi-user, multi-task, and highly dynamic offloading scenarios show that the proposed method consistently reduces latency compared with several DL-based approaches, achieving improvements of 57% under varying user computing ability, 86% with different servers, 30% under distinct transmit powers, and 42% for varying data volumes. These results demonstrate the effectiveness of LLM-driven solutions for resource allocation problems in MEC systems.
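
To make the coupling between the offloading ratio, transmit power, and edge computing allocation concrete, here is a minimal single-user sketch. It assumes a textbook partial-offloading model that is not taken from the paper: the local and offloaded parts run in parallel, the uplink rate follows the Shannon formula, and result download time is ignored. All parameter names and numeric values are hypothetical, and the grid search over the offloading ratio is only a stand-in for a real solver.

```python
import math

def uplink_rate(bandwidth_hz, tx_power_w, channel_gain, noise_w):
    """Shannon rate in bits/s (assumed channel model)."""
    return bandwidth_hz * math.log2(1.0 + tx_power_w * channel_gain / noise_w)

def latency(rho, data_bits, cycles_per_bit, f_local, f_edge, rate):
    """Latency when a fraction rho of the task is offloaded.

    The local part and the offloaded part are assumed to run in parallel,
    so the task finishes when the slower of the two finishes.
    """
    t_local = (1.0 - rho) * data_bits * cycles_per_bit / f_local
    t_offload = rho * data_bits / rate + rho * data_bits * cycles_per_bit / f_edge
    return max(t_local, t_offload)

# Hypothetical parameters chosen so the numbers are easy to follow.
rate = uplink_rate(bandwidth_hz=1e6, tx_power_w=0.3, channel_gain=1e-8, noise_w=1e-9)
params = dict(data_bits=1e6, cycles_per_bit=1000, f_local=1e9, f_edge=1e10, rate=rate)

# Grid search over the offloading ratio with power and edge CPU held fixed.
grid = [i / 1000 for i in range(1001)]
best_rho = min(grid, key=lambda r: latency(r, **params))
best_latency = latency(best_rho, **params)

print("all local:  ", latency(0.0, **params))
print("all offload:", latency(1.0, **params))
print("best rho:   ", best_rho, "latency:", best_latency)
```

With these values, fully local execution takes about 1.0 s and fully offloaded execution about 0.6 s, while splitting the task so both parts finish together (ratio 0.625) brings latency to about 0.375 s. Changing the transmit power or the edge CPU share moves that balance point, which is why the three variables have to be optimized jointly.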

Paper Structure

This paper contains 20 sections, 17 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Multiuser MEC system model.
  • Figure 2: The framework of RAG.
  • Figure 3: The framework of Bi-encoder.
  • Figure 4: The framework of RAG for the MEC system.
  • Figure 5: The prompt of LLM for decision
  • ...and 4 more figures