End-to-End Learning for Task-Oriented Semantic Communications Over MIMO Channels: An Information-Theoretic Framework

Chang Cai, Xiaojun Yuan, Ying-Jun Angela Zhang

TL;DR

The paper proposes a decoupled pretraining framework that trains the feature encoder and the MIMO precoder separately, with a maximum a posteriori (MAP) classifier at the server generating the inference result. It also develops two deep unfolded precoding networks that incorporate domain knowledge from the solution to the decoupled precoding problem.

Abstract

This paper addresses the problem of end-to-end (E2E) design of learning and communication in a task-oriented semantic communication system. In particular, we consider a multi-device cooperative edge inference system over a wireless multiple-input multiple-output (MIMO) multiple access channel, where multiple devices transmit extracted features to a server to perform a classification task. We formulate the E2E design of feature encoding, MIMO precoding, and classification as a conditional mutual information maximization problem. However, it is notoriously difficult to design and train an E2E network that can be adaptive to both the task dataset and different channel realizations. Regarding network training, we propose a decoupled pretraining framework that separately trains the feature encoder and the MIMO precoder, with a maximum a posteriori (MAP) classifier employed at the server to generate the inference result. The feature encoder is pretrained exclusively using the task dataset, while the MIMO precoder is pretrained solely based on the channel and noise distributions. Nevertheless, we manage to align the pretraining objectives of each individual component with the E2E learning objective, so as to approach the performance bound of E2E learning. By leveraging the decoupled pretraining results for initialization, the E2E learning can be conducted with minimal training overhead. Regarding network architecture design, we develop two deep unfolded precoding networks that effectively incorporate the domain knowledge of the solution to the decoupled precoding problem. Simulation results on both the CIFAR-10 and ModelNet10 datasets verify that the proposed method achieves significantly higher classification accuracy compared to various baselines.
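As a rough illustration of the MAP classification step mentioned in the abstract, the sketch below picks the class that maximizes the posterior $p(y \mid \mathbf{r}) \propto p(y)\, p(\mathbf{r} \mid y)$. The abstract does not specify the class-conditional model, so the real-valued diagonal Gaussian likelihood and the names `map_classify`, `means`, `variances`, and `priors` are assumptions made only for this example, not the paper's actual classifier.

```python
import numpy as np

def map_classify(r, means, variances, priors):
    """Return the MAP class index and the unnormalized log-posteriors.

    Assumption (not from the paper): p(r | y) is a real-valued Gaussian
    with per-class mean `means[y]` and diagonal variance `variances[y]`.
    """
    r = np.asarray(r, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    priors = np.asarray(priors, dtype=float)

    diff = r[None, :] - means
    # Gaussian log-likelihood, summed over feature dimensions
    log_lik = -0.5 * np.sum(diff**2 / variances + np.log(2 * np.pi * variances), axis=1)
    # log p(y) + log p(r | y); the evidence p(r) is constant across classes
    log_post = np.log(priors) + log_lik
    return int(np.argmax(log_post)), log_post

# Hypothetical two-class, two-dimensional example
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
label_near_1, _ = map_classify([2.9, 3.2], means, variances, [0.5, 0.5])
label_near_0, _ = map_classify([0.1, -0.2], means, variances, [0.5, 0.5])
# A point equidistant from both means is decided by the prior
label_prior, _ = map_classify([1.5, 1.5], means, variances, [0.99, 0.01])
```

With equal priors the rule reduces to maximum likelihood; the prior only changes the decision when the likelihoods are close, as in the last call.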

Paper Structure (26 sections, 1 theorem, 50 equations, 9 figures, 5 tables, 3 algorithms)

This paper contains 26 sections, 1 theorem, 50 equations, 9 figures, 5 tables, 3 algorithms.

Key Result

Proposition 1

Let $\mathbf{L}, \mathbf{M} \in \mathbb{H}^n$ such that $\mathbf{M} \succeq \mathbf{L}$. The function $\mathbf{v}^{\sf H} \mathbf{L} \mathbf{v}$ with $\mathbf{v} \in \mathbb{C}^n$ is majorized at any point $\underline{\mathbf{v}} \in \mathbb{C}^n$ by
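The majorizing function is cut off in this listing. A standard quadratic majorizer from the majorization-minimization literature, which may be presented differently in the paper, is $\mathbf{v}^{\sf H} \mathbf{M} \mathbf{v} + 2\,\mathrm{Re}\{\mathbf{v}^{\sf H} (\mathbf{L} - \mathbf{M}) \underline{\mathbf{v}}\} + \underline{\mathbf{v}}^{\sf H} (\mathbf{M} - \mathbf{L}) \underline{\mathbf{v}}$; it follows from expanding $(\mathbf{v} - \underline{\mathbf{v}})^{\sf H} (\mathbf{M} - \mathbf{L}) (\mathbf{v} - \underline{\mathbf{v}}) \geq 0$. The sketch below checks this form numerically on random Hermitian matrices; the helper names `quad` and `majorizer` are hypothetical.

```python
import numpy as np

def quad(v, A):
    """Real part of the Hermitian quadratic form v^H A v."""
    return float(np.real(np.vdot(v, A @ v)))

def majorizer(v, v0, L, M):
    """Assumed standard majorizer of v^H L v at v0, valid when M - L is PSD."""
    D = M - L
    return quad(v, M) - 2.0 * float(np.real(np.vdot(v, D @ v0))) + quad(v0, D)

rng = np.random.default_rng(0)
n = 4

def crandn(*shape):
    # Complex standard-normal samples
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

A = crandn(n, n)
L = (A + A.conj().T) / 2          # Hermitian, possibly indefinite
B = crandn(n, n)
M = L + B @ B.conj().T            # guarantees M - L is positive semidefinite

v0 = crandn(n)
# Gap between majorizer and function at random points: should be nonnegative
gaps = [majorizer(v, v0, L, M) - quad(v, L) for v in (crandn(n) for _ in range(200))]
min_gap = min(gaps)
# Gap at the expansion point itself: should vanish (tightness)
gap_at_v0 = majorizer(v0, v0, L, M) - quad(v0, L)
```

The gap equals $(\mathbf{v} - \underline{\mathbf{v}})^{\sf H} (\mathbf{M} - \mathbf{L}) (\mathbf{v} - \underline{\mathbf{v}})$, so it is nonnegative everywhere and zero at $\mathbf{v} = \underline{\mathbf{v}}$, which are the two properties a majorizer must satisfy.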

Figures (9)

  • Figure 1: Block diagram and Markov model of the considered multi-device edge inference system.
  • Figure 2: (a) Illustration of the data processing inequality; (b) The first step aims to maximize $I(Z_{1:K}; Y)$, while $I(R;Y|S)$ can either increase or decrease; (c) The second step aims to maximize $I(R;Y|S)$.
  • Figure 3: The $\ell$-th layer of the vanilla DU-BCA precoder.
  • Figure 4: The $\ell$-th layer of the enhanced DU-BCA-MM precoder.
  • Figure 5: Testing accuracy at each training epoch in the E2E learning phase, where the power budget is $P_0 = 15$ dBm and the number of transmit time slots is $O=1$. The experiments are carried out on ModelNet10.
  • ...and 4 more figures

Theorems & Definitions (6)

  • Remark 1
  • Remark 2
  • Remark 3
  • Remark 4
  • Remark 5
  • Proposition 1: palomar2018icassp