LaMI-GO: Latent Mixture Integration for Goal-Oriented Communications Achieving High Spectrum Efficiency
Achintha Wijesinghe, Suchinthaka Wanninayaka, Weiwei Wang, Yu-Chieh Chao, Songyang Zhang, Zhi Ding
TL;DR
LaMI-GO introduces a latent-domain GO-COM framework that combines a shared codebook with latent diffusion (via a text-conditioned Paella backbone) to achieve high spectrum efficiency for goal-oriented tasks. It replaces pixel-domain reconstruction with a latent index-based representation and employs latent mixture integration to recover images without retraining, achieving strong perceptual quality and improved downstream task performance under constrained bandwidth. The approach demonstrates robustness to channel noise and packet loss, outperforming state-of-the-art GO-COM methods in reconstruction quality and bandwidth efficiency, while maintaining practical showtime speeds. This work highlights the potential of latent diffusion with structured masking strategies (PRM, PDM, EBM) for scalable, privacy-preserving, and task-driven communications in future wireless systems.
Abstract
The recent rise of semantic-style communications includes the development of goal-oriented communications (GO-COMs), which are remarkably efficient for multimedia information transmission. The concept of GO-COMs leverages advanced artificial intelligence (AI) tools to address the rising demand for bandwidth efficiency in applications such as edge computing and Internet-of-Things (IoT). Unlike traditional communication systems, which focus on source data accuracy, GO-COMs provide intelligent message delivery that caters to the specific needs critical to accomplishing downstream tasks at the receiver. In this work, we present a novel GO-COM framework, namely LaMI-GO, that utilizes emerging generative AI for better quality-of-service (QoS) with ultra-high communication efficiency. Specifically, we design our LaMI-GO system backbone based on a latent diffusion model followed by a vector-quantized generative adversarial network (VQGAN) for efficient latent embedding and information representation. The system trains a common feature codebook at the receiver side. Our experimental results demonstrate substantial improvements in perceptual quality, accuracy of downstream tasks, and bandwidth consumption over state-of-the-art GO-COM systems and establish the power of our proposed LaMI-GO communication framework.
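To make the idea of a codebook-based latent index representation more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy two-dimensional codebook and plain nearest-neighbor vector quantization; the function names (`quantize`, `dequantize`, `payload_bits`) and all numeric values are hypothetical and chosen only for illustration. The point it shows is that, once transmitter and receiver hold the same codebook, only integer indices need to be sent, each costing about log2(K) bits for a codebook of K entries.

```python
import math

def quantize(latents, codebook):
    """Map each latent vector to the index of its nearest codebook entry
    (squared Euclidean distance). Toy stand-in for a VQ encoder output."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [
        min(range(len(codebook)), key=lambda k: sq_dist(z, codebook[k]))
        for z in latents
    ]

def dequantize(indices, codebook):
    """Receiver side: look the received indices up in the same codebook."""
    return [codebook[k] for k in indices]

def payload_bits(num_tokens, codebook_size):
    """Bits needed to send num_tokens indices with fixed-length coding."""
    return num_tokens * math.ceil(math.log2(codebook_size))

# Hypothetical 4-entry codebook and three latent vectors (illustrative only).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
latents = [(0.1, -0.2), (0.9, 0.8), (0.2, 1.1)]

indices = quantize(latents, codebook)      # what the transmitter would send
recovered = dequantize(indices, codebook)  # what the receiver looks up
bits = payload_bits(len(indices), len(codebook))
print(indices, recovered, bits)
```

In this sketch the receiver recovers only the quantized latents; in a system such as the one described above, a generative model would additionally fill in any indices that were masked or lost before decoding back to an image.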
