
Tackling Distribution Shifts in Task-Oriented Communication with Information Bottleneck

Hongru Li, Jiawei Shao, Hengtao He, Shenghui Song, Jun Zhang, Khaled B. Letaief

TL;DR

This work tackles distribution shifts in task-oriented communication by combining information bottleneck theory with invariant risk minimization to learn compact, domain-generalizable features. It introduces Variational Invariant Feature Encoding (VIFE) to emphasize causal information and suppress domain-dependent spurious cues, and Variational Label-dependent Feature Encoding (VLFE) to enable semantic-shift detection via label-aware priors and contrastive latent-space separation. The authors derive tractable variational objectives that replace intractable mutual-information terms with learnable surrogates and gradient penalties, enabling end-to-end training without access to test data. Experimental results on Colored-MNIST and Colored-Object demonstrate improved rate-distortion performance and robust semantic-shift detection under open-world distribution shifts, outperforming DeepJSCC baselines and IRM/VIB variants. The proposed framework opens a practical path toward retraining-free, robust task-oriented communication in dynamic environments and motivates future work on time-varying causality and multi-user settings.

Abstract

Task-oriented communication aims to extract and transmit task-relevant information to significantly reduce the communication overhead and transmission latency. However, the unpredictable distribution shifts between training and test data, including domain shift and semantic shift, can dramatically undermine the system performance. To tackle these challenges, it is crucial to ensure that the encoded features can generalize to domain-shifted data and detect semantic-shifted data, while remaining compact for transmission. In this paper, we propose a novel approach based on the information bottleneck (IB) principle and the invariant risk minimization (IRM) framework. The proposed method aims to extract compact and informative features that support effective domain-shift generalization and accurate semantic-shift detection without any knowledge of the test data during training. Specifically, we propose an invariant feature encoding approach based on the IB principle and the IRM framework for domain-shift generalization, which aims to find the causal relationship between the input data and the task result by minimizing the complexity and domain dependence of the encoded feature. Furthermore, we enhance task-oriented communication with a label-dependent feature encoding approach for semantic-shift detection, which achieves joint gains in IB optimization and detection performance. To avoid the intractable computation of the IB-based objective, we leverage variational approximation to derive a tractable upper bound for optimization. Extensive simulation results on image classification tasks demonstrate that the proposed scheme outperforms state-of-the-art approaches and achieves a better rate-distortion tradeoff.
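
As a rough illustration of how an IB-style variational bound and an IRM-style penalty can be combined into one training loss, the following sketch uses only the Python standard library. It is not the authors' implementation. It assumes a diagonal-Gaussian encoder with a standard-normal prior (the usual variational IB surrogate for the complexity term), a binary task with a logistic classifier, and the IRMv1 gradient penalty computed with respect to a scalar dummy classifier. All function names and the weights `beta` and `lam` are hypothetical.

```python
import math


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def binary_cross_entropy(logit, y):
    """Stable cross-entropy of a logit against a label y in {0, 1} (nats)."""
    return max(logit, 0.0) - logit * y + math.log1p(math.exp(-abs(logit)))


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), a variational upper
    bound on the per-sample complexity term (assumed surrogate)."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def irm_penalty(logits, labels):
    """IRMv1-style penalty for one domain: squared gradient of the mean
    cross-entropy with respect to a scalar classifier scale w, at w = 1."""
    grad = sum((sigmoid(s) - y) * s for s, y in zip(logits, labels))
    grad /= len(logits)
    return grad * grad


def objective(domains, beta, lam):
    """Average over training domains of: distortion + beta * rate + lam * penalty.
    beta and lam are hypothetical trade-off weights."""
    total = 0.0
    for d in domains:
        n = len(d["labels"])
        ce = sum(binary_cross_entropy(s, y)
                 for s, y in zip(d["logits"], d["labels"])) / n
        kl = sum(gaussian_kl_to_standard_normal(mu, lv)
                 for mu, lv in zip(d["mu"], d["log_var"])) / n
        total += ce + beta * kl + lam * irm_penalty(d["logits"], d["labels"])
    return total / len(domains)


# Toy usage: one domain with two samples and a 2-dimensional feature.
toy_domain = {
    "logits": [0.0, 0.0],
    "labels": [0, 1],
    "mu": [[0.0, 0.0], [0.0, 0.0]],
    "log_var": [[0.0, 0.0], [0.0, 0.0]],
}
loss = objective([toy_domain], beta=1.0, lam=1.0)
```

In this toy case the encoder output equals the prior and the logits are zero, so the rate and penalty terms vanish and only the cross-entropy of an uninformative classifier remains. In practice the logits and Gaussian parameters would come from trainable networks and the loss would be minimized with an automatic-differentiation framework.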

Paper Structure (27 sections, 28 equations, 8 figures, 3 tables, 1 algorithm)


Figures (8)

  • Figure 1: The system model and comparison with previous schemes. The information source $X$ includes task-relevant information (e.g., digit), task-irrelevant information (e.g., color) and unknown data during testing. On-device featurizers extract and transmit $z$, while the server-based inferencers derive the task result $\hat{y}$ from the received feature $\hat{z}$. Previous non-causal-aware methods may easily rely on the task-irrelevant features. In contrast, the proposed method aims to discover the causal relationship between $X$ and $Y$ by disregarding the unstable task-irrelevant information. This is achieved by minimizing the complexity ($\min I(X;\hat{Z})$) and domain dependence ($\min I(Y;D|\hat{Z})$) of $\hat{Z}$, and thus the compact and causal features can be selected from input data $X$.
  • Figure 2: The probabilistic graphical model of data generation, data transmission and task inference of the proposed task-oriented communication system.
  • Figure 3: Examples of distribution shifts on a natural-image dataset and on MNIST-related datasets when task-oriented communication systems are deployed under the open-world assumption. (a) In-distribution (ID) data: a horse on grass; domain-shifted data: a horse on a snowfield; semantic-shifted data: a deer. (b) ID data: the gray MNIST dataset; domain-shifted data: Colored-MNIST, in which color is highly correlated with the digit; semantic-shifted data: the Fashion-MNIST dataset.
  • Figure 4: Venn diagrams of IB and LFE showing the areas that their objectives maximize and minimize. Blue lines: the area minimized by both objectives. Gray lines: the area maximized by both objectives. The colored arrows indicate the information loss caused by channel noise. IB: The term to be maximized, i.e., $I(\hat{Z};Y)$, is a subset of the term to be minimized, namely, $I(\hat{Z};X)$. LFE: The term to be maximized, i.e., $I(\hat{Z};Y)$, is no longer a subset of the term to be minimized, namely, $I(\hat{Z};X|Y)$.
  • Figure 5: The rate-distortion curves for image classification tasks with $\text{PSNR}_{\text{test}}=\text{PSNR}_{\text{train}}$. (a) Colored-MNIST, $\textrm{PSNR}= 10\textrm{dB}$, (b) Colored-MNIST, $\textrm{PSNR}= 20\textrm{dB}$, (c) Colored-Object, $\textrm{PSNR}= 10\textrm{dB}$ and (d) Colored-Object, $\textrm{PSNR}= 20\textrm{dB}$.
  • ...and 3 more figures
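
The subset remark in the Figure 4 caption can be spelled out with the chain rule for mutual information. The short derivation below is a sketch under one assumption that is not stated on this page: that $Y \leftrightarrow X \leftrightarrow \hat{Z}$ forms a Markov chain (the received feature depends on the label only through the input), so that $I(\hat{Z};Y|X)=0$.

$$
I(\hat{Z};X) \;=\; I(\hat{Z};X,Y) - I(\hat{Z};Y|X) \;=\; I(\hat{Z};X,Y) \;=\; I(\hat{Z};Y) + I(\hat{Z};X|Y) \;\ge\; I(\hat{Z};Y).
$$

Under this assumption, minimizing $I(\hat{Z};X)$ as in IB also presses down on the very term $I(\hat{Z};Y)$ that is being maximized, whereas minimizing only the conditional part $I(\hat{Z};X|Y)$, as in LFE, leaves $I(\hat{Z};Y)$ untouched.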