Remote Training in Task-Oriented Communication: Supervised or Self-Supervised with Fine-Tuning?

Hongru Li, Hang Zhao, Hengtao He, Shenghui Song, Jun Zhang, Khaled B. Letaief

TL;DR

The paper addresses the challenge of high training overhead in remote task-oriented communication within dynamic wireless networks. It introduces a two-stage framework: task-agnostic, label-free self-supervised pre-training of the transmitter, followed by task-specific fine-tuning of both transmitter and receiver using labels. Central to the approach is mutual information maximization, implemented via a tractable InfoNCE objective to learn representations that capture shared, task-relevant information across data views, with theoretical support that self-supervised learning can approach supervised performance under bounded information loss. Empirical results on Omniglot demonstrate substantially reduced training communication rounds (roughly half) compared with full supervision, and that SSL-FT with partial labels can surpass fully labeled baselines, highlighting the practical impact for label-efficient, communication-efficient training in edge-enabled, task-oriented communication systems.
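The InfoNCE objective mentioned above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `info_nce_loss`, the cosine-similarity critic, and the `temperature` value are assumptions chosen for illustration. Row `i` of each input is taken to be the representation of one of two views of the same sample, and the other rows in the batch act as negatives.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.1):
    """InfoNCE loss for a batch of paired representations (illustrative).

    z_a, z_b: arrays of shape (N, d). Row i of z_a and row i of z_b are
    assumed to come from two views of the same sample (a positive pair);
    every other row in the batch is treated as a negative.
    """
    # L2-normalize so the dot product below is a cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Pairwise similarity scores, shape (N, N); positives sit on the diagonal.
    logits = z_a @ z_b.T / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Average negative log-probability assigned to the positive pairs.
    return -np.mean(np.diag(log_prob))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(8, 16))
    z_view = z + 0.01 * rng.normal(size=z.shape)   # slightly perturbed second view
    print("aligned views:  ", info_nce_loss(z, z_view))
    print("unrelated views:", info_nce_loss(z, rng.normal(size=z.shape)))
```

Minimizing this loss pushes the two views of each sample toward each other and away from the rest of the batch. For a batch of size N, the quantity log N minus the loss is the standard InfoNCE lower bound on the mutual information between the two views, which is why the loss serves as a tractable surrogate for mutual information maximization.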

Abstract

Task-oriented communication focuses on extracting and transmitting only the information relevant to specific tasks, effectively minimizing communication overhead. Most existing methods prioritize reducing this overhead during inference, often assuming feasible local training or minimal training communication resources. However, in real-world wireless systems with dynamic connection topologies, training models locally for each new connection is impractical, and task-specific information is often unavailable before establishing connections. Therefore, minimizing training overhead and enabling label-free, task-agnostic pre-training before connection establishment are essential for effective task-oriented communication. In this paper, we tackle these challenges by employing a mutual information maximization approach grounded in self-supervised learning and information-theoretic analysis. We propose an efficient strategy that pre-trains the transmitter in a task-agnostic and label-free manner, followed by joint fine-tuning of both the transmitter and receiver in a task-specific, label-aware manner. Simulation results show that our proposed method reduces training communication overhead to about half that of fully supervised methods using the SGD optimizer, demonstrating significant improvements in training efficiency.

Paper Structure

This paper contains 9 sections, 1 theorem, 22 equations, 2 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

For a specific view $X$ with perfect information transmission, i.e., $I(Z;X') - I(\hat{Z};X') = 0$, the optimal learned representations from supervised learning satisfy Then, given $X$ and $X' = \textnormal{Aug}(X)$ with information loss $I(Z;X') - I(\hat{Z};X') = \epsilon_c$ due to the wireless channel, the optimal learned representations from self-supervised learning satisfy With the minimal o

Figures (2)

  • Figure 1: Illustration of our proposed framework for communication-efficient remote training in task-oriented communication. In stage I, the main goal is to learn the parameters $\theta$ at the edge device for feature extraction without needing the label information $Y$ or the downstream task $T$. Specifically, the edge devices use data augmentation techniques to generate paired data samples $(X,X')$ from the original observed data $X$ and learn task-relevant information from the paired data by maximizing $I(\hat{Z},\hat{Z'})$. Furthermore, we assume the channel statistics are known at the edge devices for local training, and the model for representation extraction can be a Siamese or dual network. In stage II, once the inference task $T$ and label $Y$ are known, the edge devices and the central server jointly train $(\theta,\phi)$ for the inference task by maximizing the representation sufficiency $I(\hat{Z},Y|T)$.
  • Figure 2: The communication round versus the test accuracy for the four methods with $\text{SNR}_{\text{test}}=\text{SNR}_{\text{train}}$.
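The captions above refer to noisy received representations $\hat{Z}$ and to training and test SNRs. As a rough illustration of how a transmitted representation $Z$ could be corrupted at a chosen SNR during local training, the sketch below assumes an additive white Gaussian noise (AWGN) channel. The channel model is an assumption, since this summary does not state which model the paper uses, and the helper name `awgn_channel` is hypothetical.

```python
import numpy as np


def awgn_channel(z, snr_db, rng):
    """Return z plus white Gaussian noise at the requested SNR (in dB).

    Assumption: AWGN channel with noise power set relative to the average
    power of the batch z. This is an illustrative model only.
    """
    signal_power = np.mean(z ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(scale=np.sqrt(noise_power), size=z.shape)
    return z + noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(1000, 64))          # stand-in for encoder outputs Z
    z_hat = awgn_channel(z, snr_db=10.0, rng=rng)

    # Empirical SNR of the received representations, in dB.
    measured = 10.0 * np.log10(np.mean(z ** 2) / np.mean((z_hat - z) ** 2))
    print(f"measured SNR: {measured:.2f} dB")
```

In a setup like the one in Figure 1, a function of this kind would sit between the feature extractor and the loss, so that the objective is computed on $\hat{Z}$ rather than on $Z$.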

Theorems & Definitions (2)

  • Theorem 1
  • proof