Multi-Task Semantic Communications via Large Models
Wanli Ni, Zhijin Qin, Haofeng Sun, Xiaoming Tao, Zhu Han
TL;DR
This work addresses efficient semantic communication (SemCom) for multi-modal, multi-task settings in resource-constrained networks by integrating large AI models (LAMs) into a unified multi-task SemCom (MTSC) framework. It introduces a LAM-based MTSC architecture with adaptive model compression, federated split fine-tuning, retrieval-augmented generation, and an importance-aware semantic transmission scheme to keep semantic knowledge up to date and performance robust. Compared with two baselines, the approach achieves higher task accuracy and reconstruction quality across multiple modalities and downstream tasks under varying channel conditions. The results suggest that deploying LAMs at the network edge is viable for end-to-end SemCom and multi-task reasoning, with implications for 6G-era intelligent communications.
Abstract
Artificial intelligence (AI) promises to revolutionize the design, optimization and management of next-generation communication systems. In this article, we explore the integration of large AI models (LAMs) into semantic communications (SemCom) by leveraging their multi-modal data processing and generation capabilities. Although LAMs bring unprecedented abilities to extract semantics from raw data, this integration entails multifaceted challenges including high resource demands, model complexity, and the need for adaptability across diverse modalities and tasks. To overcome these challenges, we propose a LAM-based multi-task SemCom (MTSC) architecture, which includes an adaptive model compression strategy and a federated split fine-tuning approach to facilitate the efficient deployment of LAM-based semantic models in resource-limited networks. Furthermore, a retrieval-augmented generation scheme is implemented to synthesize the most recent local and global knowledge bases to enhance the accuracy of semantic extraction and content generation, thereby improving the inference performance. Finally, simulation results demonstrate the efficacy of the proposed LAM-based MTSC architecture, highlighting the performance enhancements across various downstream tasks under varying channel conditions.
