Trustworthy Image Semantic Communication with GenAI: Explainability, Controllability, and Efficiency
Xijun Wang, Dongshan Ye, Chenyuan Feng, Howard H. Yang, Xiang Chen, Tony Q. S. Quek
TL;DR
This work addresses the interpretability, operability, and compatibility gaps in image semantic communication (ISC) by proposing a trustworthy ISC framework that decouples the transmit and receive processes and represents images as explainable semantics: image semantic text and segmentation maps. The receiver performs GenAI-based reconstruction and multitask processing, guided by a semantic-level multi-rate transmission protocol. A correlation-driven feedback loop, a policy controller, and a shared vector database adapt the transmitted data to task requirements. Experimental results on COCO show substantial improvements in image captioning quality, competitive or superior reconstruction fidelity, and significant transmission efficiency gains, including up to 90% data reduction in semantic transmission. The framework demonstrates strong potential for flexible, task-aware, and efficient ISC in future 6G scenarios, and the work outlines open issues for real-world deployment such as device constraints, privacy, and personalized transmission.
Abstract
Image semantic communication (ISC) has garnered significant attention for its potential to achieve high efficiency in visual content transmission. However, existing ISC systems based on joint source-channel coding face challenges in interpretability, operability, and compatibility. To address these limitations, we propose a novel trustworthy ISC framework. This approach leverages text extraction and segmentation mapping techniques to convert images into explainable semantics, while employing Generative Artificial Intelligence (GenAI) for multiple downstream inference tasks. We also introduce a multi-rate ISC transmission protocol that dynamically adapts to both the received explainable semantic content and specific task requirements at the receiver. Simulation results demonstrate that our framework achieves explainable learning, decoupled training, and compatible transmission in various application scenarios. Finally, some intriguing research directions and application scenarios are identified.
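To make the idea of a multi-rate protocol that adapts to task requirements more concrete, the following is a minimal sketch and not the authors' implementation. The rate levels, the task-to-level table, the correlation threshold, and all names (RATE_LEVELS, TASK_MIN_LEVEL, Feedback, select_level, build_payload) are assumptions introduced purely for illustration. It shows one plausible way a policy controller could decide which explainable semantics to send, given the receiver's task and a correlation score fed back from the receiver.

```python
from dataclasses import dataclass

# Hypothetical rate levels: which explainable semantics are transmitted.
RATE_LEVELS = {
    0: ("text",),                     # semantic text only (lowest rate)
    1: ("text", "segmentation_map"),  # semantic text plus segmentation map
}

# Hypothetical minimum rate level required by each receiver task.
TASK_MIN_LEVEL = {"captioning": 0, "reconstruction": 1}


@dataclass
class Feedback:
    """Receiver feedback (assumed format)."""
    task: str           # downstream task requested by the receiver
    correlation: float  # assumed score in [0, 1]: how well received semantics fit the task


def select_level(fb: Feedback, threshold: float = 0.7) -> int:
    """Pick a rate level: start at the task minimum, step up if correlation is low."""
    level = TASK_MIN_LEVEL[fb.task]
    if fb.correlation < threshold:
        level = min(level + 1, max(RATE_LEVELS))  # never exceed the highest level
    return level


def build_payload(semantics: dict, level: int) -> dict:
    """Keep only the semantics that the chosen rate level includes."""
    return {name: semantics[name] for name in RATE_LEVELS[level]}


semantics = {"text": "a dog on a beach", "segmentation_map": [[0, 1], [1, 1]]}
payload = build_payload(semantics, select_level(Feedback("captioning", 0.9)))
```

In this sketch a captioning task with high correlation sends only the text, while a low correlation score or a reconstruction task also sends the segmentation map. A real system would replace the fixed threshold with a learned or tuned policy.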
