PACE: A Pragmatic Agent for Enhancing Communication Efficiency Using Large Language Models

Jiaxuan Li, Minxi Yang, Dahua Gao, Wenlong Xu, Guangming Shi

TL;DR

PACE is a training-free pragmatic communication framework that uses an LLM agent to perform semantic perception, intention resolution, and intent-oriented encoding for image transmission. By combining specialized prompts, a knowledge base of rate-distortion curves, and a chain of thought that guides resource allocation, PACE achieves higher transmission efficiency for intention-aligned regions while tolerating lower fidelity elsewhere. Experimental results on COCO/Flickr datasets demonstrate advantages over traditional and non-LLM baselines, particularly at higher intention matching levels, in an evaluation that uses both perceptual and task-oriented metrics. The approach offers a practical path toward universal pragmatic communication that leverages existing LLM capabilities without task-specific training.

Abstract

Current communication technologies face limitations in theoretical capacity, spectrum availability, and power resources. Pragmatic communication, which leverages terminal intelligence to transmit data selectively, offers a way to conserve these resources. Existing research lacks universal intention resolution tools, which limits its applicability to specific tasks. This paper proposes an image pragmatic communication framework based on a Pragmatic Agent for Communication Efficiency (PACE) using Large Language Models (LLMs). In this framework, PACE sequentially performs semantic perception, intention resolution, and intention-oriented coding. To ensure that the LLM is used effectively in communication, a knowledge base is designed to supplement the necessary knowledge, dedicated prompts are introduced to facilitate understanding of pragmatic communication scenarios and task requirements, and a chain of thought is designed to assist in making reasonable trade-offs between transmission efficiency and cost. For experimental validation, this paper constructs an image pragmatic communication dataset along with corresponding evaluation standards. Simulation results indicate that the proposed method outperforms traditional and non-LLM-based pragmatic communication methods in terms of transmission efficiency.
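
To make the idea of intention-oriented coding more concrete, the following is a minimal sketch of how a bit budget might be split across image regions using a rate-distortion table. It is not the authors' method: in PACE the allocation is reasoned out by the LLM through its chain of thought, whereas this sketch swaps in a simple greedy rule. The table `RD_CURVE`, its values, the function `allocate_bits`, and the use of the intention matching level as a weight are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical rate-distortion table: bits spent on a region -> expected MSE.
# The numbers are invented for illustration and do not come from the paper.
RD_CURVE: Dict[int, float] = {0: 100.0, 1: 60.0, 2: 35.0, 3: 20.0, 4: 12.0}


def allocate_bits(match_levels: List[int], budget: int) -> List[int]:
    """Greedily assign rate units to regions.

    match_levels[i] is an assumed intention matching level for region i
    (higher means the region matters more to the receiver's intention).
    Each step gives one more unit of rate to the region whose weighted
    distortion drops the most.
    """
    bits = [0] * len(match_levels)
    max_bits = max(RD_CURVE)  # the table defines no distortion beyond this rate
    for _ in range(budget):
        best_region, best_gain = None, 0.0
        for i, level in enumerate(match_levels):
            if bits[i] >= max_bits:
                continue  # this region is already at the highest tabulated rate
            drop = RD_CURVE[bits[i]] - RD_CURVE[bits[i] + 1]
            gain = level * drop
            if gain > best_gain:
                best_region, best_gain = i, gain
        if best_region is None:
            break  # every region is saturated; leftover budget stays unused
        bits[best_region] += 1
    return bits


if __name__ == "__main__":
    # Three regions with assumed matching levels 3, 1, and 2, and 6 rate units.
    print(allocate_bits([3, 1, 2], budget=6))
```

Under these assumed numbers, the region with the highest matching level receives the most rate, which mirrors the stated goal of spending resources on intention-aligned regions while tolerating lower fidelity elsewhere.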

Paper Structure (28 sections, 11 figures, 2 tables)

Figures (11)

  • Figure 1: Weaver and Shannon categorized communication into three levels: technical, semantic, and efficiency (or pragmatic). The technical level focuses solely on symbol transmission, the semantic level emphasizes the transmission of the meaning represented by symbols, and the pragmatic level concerns the transmission of meaning in line with intentions. Semantic communication involves extracting meaning from symbols, removing meaningless parts to reduce transmission volume. Pragmatic communication, based on requirements, filters the semantics, performing pragmatic extraction to eliminate unnecessary portions and further reduce transmission volume. The thickness of the arrows in the diagram illustrates the difference in transmission volume.
  • Figure 2: The proposed image pragmatic communication (IPC) framework based on PACE follows a process that includes image semantic perception, intention comprehension and reasoning, intent-adaptive image encoding and decoding, and intent-oriented evaluation. Modules marked with a snowflake require no training.
  • Figure 3: Comparison of PACE with intention-agnostic, text-similarity-based, and CLIP-based methods. Regions within the red boxes satisfy the intention (positive regions), while the black regions do not satisfy it (negative regions).
  • Figure 4: Line chart showing the impact of bit length. The plus and minus signs in the method names denote the metrics when the region matches the intention (level 3) and when it does not (levels 1 and 2), respectively.
  • Figure A1: This prompt defines the role that the LLM plays in this task.
  • ...and 6 more figures
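
The captions for Figures 3 and 4 describe metrics reported separately for regions that match the intention (positive) and regions that do not (negative). A minimal sketch of such a region-wise measurement is given below. PSNR is used here only as an assumed stand-in for a perceptual metric; the captions do not say which metrics the paper uses. The function `region_psnr`, the box format, and the 8-bit grayscale pixel range are likewise assumptions for illustration.

```python
import math
from typing import List, Tuple

Image = List[List[float]]        # grayscale pixels, assumed to lie in [0, 255]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def region_psnr(original: Image, decoded: Image, box: Box, inside: bool = True) -> float:
    """PSNR over the pixels inside the box (positive region) or outside it (negative region)."""
    top, left, bottom, right = box
    squared_error, count = 0.0, 0
    for r, (row_o, row_d) in enumerate(zip(original, decoded)):
        for c, (o, d) in enumerate(zip(row_o, row_d)):
            in_box = top <= r < bottom and left <= c < right
            if in_box == inside:
                squared_error += (o - d) ** 2
                count += 1
    if count == 0:
        raise ValueError("selected region contains no pixels")
    mse = squared_error / count
    # A perfect reconstruction has zero error, so PSNR is unbounded.
    return math.inf if mse == 0 else 10 * math.log10(255.0 ** 2 / mse)


if __name__ == "__main__":
    original = [[100.0, 100.0], [100.0, 100.0]]
    decoded = [[100.0, 110.0], [110.0, 110.0]]
    box = (0, 0, 1, 1)  # hypothetical positive region: the top-left pixel only
    print("positive:", region_psnr(original, decoded, box, inside=True))
    print("negative:", region_psnr(original, decoded, box, inside=False))
```

Reporting the two values side by side makes it visible when a method preserves the positive region well while letting the negative region degrade, which is the behaviour pragmatic communication aims for.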