Aligning Task- and Reconstruction-Oriented Communications for Edge Intelligence

Yufeng Diao, Yichi Zhang, Changyang She, Philip Guodong Zhao, Emma Liying Li

TL;DR

The paper tackles the inefficiency of reconstruction-oriented communications for AI-driven edge tasks by introducing ATROC, a framework that aligns the task-oriented and reconstruction-oriented paradigms. ATROC extends the Information Bottleneck with an information reshaper and uses a variational approach to keep the objective tractable. It pairs this with a JSCC modulation scheme that remains compatible with classical digital infrastructure, preserving task-relevant information while maintaining the structure of the original data, and it supports end-to-end training for edge autonomous driving. Key contributions include the ATROC framework, a variational IB objective with tractable approximations, a learnable JSCC constellation, and task-oriented end-to-end training tailored to edge-based driving in CARLA, where bits per service drop by 99.19% relative to JPEG, JPEG2000, and BPG without sacrificing driving performance. In practice, this enables efficient, real-time edge AI that plugs into existing networks and hardware, which matters most in bandwidth-constrained edge deployments.

Abstract

Existing communication systems aim to reconstruct the information at the receiver side, and are known as reconstruction-oriented communications. This approach often falls short in meeting the real-time, task-specific demands of modern AI-driven applications such as autonomous driving and semantic segmentation. As a new design principle, task-oriented communications have been developed. However, they typically require joint optimization of the encoder, decoder, and modified inference neural networks, resulting in extensive cross-system redesigns and compatibility issues. This paper proposes a novel communication framework that aligns reconstruction-oriented and task-oriented communications for edge intelligence. The idea is to extend the Information Bottleneck (IB) theory to optimize data transmission by minimizing a task-relevant loss function, while maintaining the structure of the original data with an information reshaper. This approach integrates task-oriented communications with reconstruction-oriented communications, and a variational approach is designed to handle the intractability of mutual information in high-dimensional neural network features. We also introduce a joint source-channel coding (JSCC) modulation scheme compatible with classical modulation techniques, enabling the deployment of AI technologies within existing digital infrastructures. The proposed framework is particularly effective in edge-based autonomous driving scenarios. Our evaluation in the Car Learning to Act (CARLA) simulator demonstrates that the proposed framework reduces bits per service by 99.19% compared to existing methods, such as JPEG, JPEG2000, and BPG, without compromising the effectiveness of task execution.
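
For orientation, the classical IB objective that the abstract says the paper extends can be written as below. This is the textbook form and a standard variational relaxation of it, not the paper's extended objective with the information reshaper. The symbols $X$ (input), $Y$ (task target), $Z$ (transmitted representation), $\beta$ (trade-off weight), and the variational distributions $q(y \mid z)$ and $q(z)$ are assumptions introduced here for illustration.

```latex
% Classical Information Bottleneck: compress X into Z while keeping what Z says about Y.
\min_{p(z \mid x)} \; I(X;Z) \;-\; \beta \, I(Z;Y)

% Both mutual-information terms are intractable for high-dimensional features,
% so they are commonly replaced by variational bounds:
I(Z;Y) \;\ge\; \mathbb{E}_{p(x,y)\,p(z \mid x)}\big[\log q(y \mid z)\big] + H(Y),
\qquad
I(X;Z) \;\le\; \mathbb{E}_{p(x)}\Big[ D_{\mathrm{KL}}\big(p(z \mid x) \,\|\, q(z)\big) \Big]

% Substituting the bounds (and dropping the constant H(Y)) gives a trainable loss:
\mathcal{L}_{\mathrm{VIB}}
  = \mathbb{E}_{p(x)}\Big[ D_{\mathrm{KL}}\big(p(z \mid x) \,\|\, q(z)\big) \Big]
  \;-\; \beta \, \mathbb{E}_{p(x,y)\,p(z \mid x)}\big[\log q(y \mid z)\big]
```

The second term is a task loss such as cross-entropy, while the first penalises how much of the input the representation retains, which is what drives the reduction in transmitted bits.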

Paper Structure

This paper contains 29 sections, 40 equations, 8 figures, 2 tables, 2 algorithms.

Figures (8)

  • Figure 1: Comparison of three JSCC-enabled communication frameworks for edge inference: Reconstruction-oriented, non-aligned task-oriented, and ATROC frameworks. All three frameworks can share a similar JSCC encoder structure on the device side. On the edge side, reconstruction-oriented communication aims to fully reconstruct the input data, including both task-specific and task-agnostic information. In contrast, non-aligned task-oriented communication focuses solely on preserving task-specific information and uses JSCC symbols directly for inference. ATROC merges the benefits of the previous two by transferring task-specific information and ensuring that data structures are compatible with existing AI agent networks, enhancing integration and efficiency.
  • Figure 2: An example of the JSCC modulation and signal transmission procedure for $\bm{z} \in \mathbb{C}^4$ using 16-QAM.
  • Figure 3: Architecture of the proposed JSCC encoder and information reshaper. For example, ConvC 3-1 represents a convolutional layer with $C$ channels, a $3\times3$ kernel size, and padding of 1 on both sides. $\downarrow$2 denotes the strided down convolutions, while NN$\uparrow$2 denotes the nearest neighbor upsampling. FC2048 refers to a fully connected layer with an output size of 2048. BatchNorm denotes batch normalization, LReLU represents the leaky ReLU activation with $\alpha=0.2$, and $\Omega$ represents the batch size. The dimensions (number of channels) of the inputs and outputs for the ResBlock remain unchanged.
  • Figure 4: Training of the constellation parameter for 16-QAM, 64-QAM, and 256-QAM. Regardless of the initial value of the constellation parameter, the optimal value consistently converges.
  • Figure 5: Driving score of fine-tuned models based on 64-QAM with different constellation parameters ($r\in \{1, r^*, 10\}$, where $r^*=3.04$) under the AWGN channel with SNR range from -10 dB to 10 dB.
  • ...and 3 more figures
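
The procedure named in the Figure 2 caption (modulating $\bm{z} \in \mathbb{C}^4$ with 16-QAM) and the AWGN channel from the Figure 5 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's scheme: it uses a fixed square constellation normalised to unit average power and a hard nearest-neighbour mapping, and it does not model the learnable constellation parameter. All function names and the example vector `z` are hypothetical.

```python
import math
import random


def qam_constellation(order=16):
    """Square QAM grid, normalised to unit average symbol power."""
    side = math.isqrt(order)
    if side * side != order:
        raise ValueError("order must be a perfect square, e.g. 16, 64, 256")
    levels = [2 * i - (side - 1) for i in range(side)]  # 16-QAM: -3, -1, 1, 3
    points = [complex(re, im) for re in levels for im in levels]
    avg_power = sum(abs(p) ** 2 for p in points) / len(points)
    scale = math.sqrt(avg_power)
    return [p / scale for p in points]


def modulate(z, points):
    """Map each complex value to the index of its nearest constellation point."""
    return [min(range(len(points)), key=lambda k: abs(zi - points[k])) for zi in z]


def awgn(symbols, snr_db, rng):
    """Add complex Gaussian noise, assuming unit average signal power."""
    noise_power = 10 ** (-snr_db / 10)
    sigma = math.sqrt(noise_power / 2)  # standard deviation per real dimension
    return [s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for s in symbols]


def demodulate(received, points):
    """Hard decision: nearest constellation point for each received sample."""
    return modulate(received, points)


rng = random.Random(0)
points = qam_constellation(16)

# Hypothetical encoder output in C^4 (illustrative values only).
z = [0.9 + 0.8j, -0.2 + 0.3j, -1.0 - 0.9j, 0.4 - 0.3j]

tx_idx = modulate(z, points)                # symbol indices sent by the device
tx = [points[k] for k in tx_idx]            # transmitted 16-QAM symbols
rx = awgn(tx, snr_db=10, rng=rng)           # channel at 10 dB SNR
rx_idx = demodulate(rx, points)             # indices recovered at the edge
bits_sent = len(z) * int(math.log2(len(points)))  # 4 symbols x 4 bits each

print("sent:", tx_idx, "received:", rx_idx, "bits:", bits_sent)
```

Because each element of `z` becomes one 16-QAM symbol, the four-dimensional vector costs 16 bits on the air, and the symbols can be carried by any standard QAM transceiver. Lowering `snr_db` toward the -10 dB end of the Figure 5 range makes symbol errors more likely.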