Distributed Information Bottleneck Theory for Multi-Modal Task-Aware Semantic Communication
Yujie Zhou, Cheng Peng, Rulong Wang, Yong Xiao, Yingyu Li, Guangming Shi, Ping Zhang
TL;DR
This work introduces a task-aware distributed information bottleneck (TADIB) framework that selects and compresses data modalities for multi-modal, multi-task semantic communication under resource constraints. By formulating a task-modality score and relaxing the discrete modality-task selection into a probabilistic, cooperative policy (pTADIB), the authors enable end-to-end optimization of both the modality selection and the semantic codecs using variational bounds and gradient-based methods. On two public datasets, the approach achieves comparable or better task inference quality with significantly reduced communication and computation, and it comes with theoretical guarantees on convergence and optimality under realistic constraints. The practical impact lies in enabling efficient, scalable, and task-tailored semantic communication for next-generation networks such as 6G, where heterogeneous modalities and diverse tasks coexist at distributed devices.
Abstract
Semantic communication shifts the focus from bit-level accuracy to task-relevant semantic delivery, enabling efficient and intelligent communication for next-generation networks. However, existing multi-modal solutions often process all available data modalities indiscriminately, ignoring that the modalities contribute unequally to downstream tasks. This not only leads to severe resource inefficiency but also degrades task inference performance, because irrelevant or redundant information is transmitted and processed. To tackle this issue, we propose a novel task-aware distributed information bottleneck (TADIB) framework, which quantifies the contribution of any set of modalities to given tasks. Based on this theoretical framework, we design a practical coding scheme that selects and compresses only the most task-relevant modalities at the transmitter. To find the optimal selection and the codecs in the network, we adopt a probabilistic relaxation of the discrete selection, enabling distributed encoders to make coordinated decisions using score function estimation and common randomness. Extensive experiments on public datasets demonstrate that our solution matches or surpasses the inference quality of full-modal baselines while significantly reducing communication and computational costs.
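
The probabilistic relaxation and the role of common randomness can be illustrated with a minimal sketch. It is not the authors' pTADIB algorithm. It assumes each modality is kept with an independent Bernoulli probability, uses a hypothetical toy objective (toy_loss, with made-up relevance weights and a per-modality cost) in place of the variational bound, and estimates the gradient with a plain score-function (REINFORCE) estimator. The batch-mean baseline reduces variance at the price of a small bias. All function and parameter names are illustrative.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_mask(logits, rng):
    """Draw one keep (1) / drop (0) decision per modality."""
    return [1 if rng.random() < sigmoid(z) else 0 for z in logits]


def toy_loss(mask, relevance, cost):
    """Hypothetical objective: relevance lost by dropping modalities,
    plus a fixed transmission cost for every modality that is kept."""
    missed = sum(r * (1 - m) for r, m in zip(relevance, mask))
    return missed + cost * sum(mask)


def score_function_grad(logits, relevance, cost, rng, n_samples=64):
    """Score-function estimate of d E[loss] / d logits."""
    probs = [sigmoid(z) for z in logits]
    masks = [sample_mask(logits, rng) for _ in range(n_samples)]
    losses = [toy_loss(m, relevance, cost) for m in masks]
    baseline = sum(losses) / n_samples  # variance-reduction baseline
    grad = [0.0] * len(logits)
    for mask, loss in zip(masks, losses):
        for i, (m, p) in enumerate(zip(mask, probs)):
            # d/d logit of log Bernoulli(m; sigmoid(logit)) equals m - p
            grad[i] += (loss - baseline) * (m - p) / n_samples
    return grad


def train(relevance, cost, steps=300, lr=0.5, seed=0):
    """Gradient descent on the selection logits; returns keep probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * len(relevance)
    for _ in range(steps):
        grad = score_function_grad(logits, relevance, cost, rng)
        logits = [z - lr * g for z, g in zip(logits, grad)]
    return [sigmoid(z) for z in logits]


if __name__ == "__main__":
    # Modality 1 is assumed to carry little task-relevant information.
    print(train(relevance=[1.0, 0.1, 0.6], cost=0.3))

    # Common randomness: two encoders holding the same seed and the same
    # logits draw the same selection without exchanging any message.
    shared_logits = [0.2, -1.0, 0.7]
    mask_a = sample_mask(shared_logits, random.Random(7))
    mask_b = sample_mask(shared_logits, random.Random(7))
    print(mask_a == mask_b)
```

In this toy setting the expected loss is linear in each keep probability, so the learned probabilities drift toward keeping a modality whose relevance exceeds the cost and dropping one whose relevance falls below it. The shared seed stands in for common randomness: distributed encoders can sample a consistent selection locally rather than negotiating it over the channel.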
