Knowledge Distillation for mmWave Beam Prediction Using Sub-6 GHz Channels

Sina Tavakolian, Nhan Thanh Nguyen, Ahmed Alkhateeb, Markku Juntti

TL;DR

This work tackles the high beam-training overhead of mmWave systems by transferring knowledge from a large, high-performing teacher network to compact student networks. The authors formulate sub-6 GHz-to-mmWave beam mapping as a classification task and present a knowledge distillation (KD) framework that uses individual KD (IKD), relational KD (RKD), and self-distillation to produce lightweight models that closely match the teacher’s beam-prediction accuracy and spectral efficiency. Empirical results on DeepMIMO datasets show up to a 99% reduction in trainable parameters and FLOPs, with RKD offering a slight performance edge over IKD and both surpassing a non-distilled baseline. The approach enables practical, low-complexity deep learning (DL) solutions for real-time mmWave beamforming in high-mobility scenarios, with potential extensions to dynamic antenna selection and reduced RF chains.
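
To make the classification framing concrete, the sketch below shows one common way a beam-index label can be produced: pick the codeword with the largest beamforming gain for a given mmWave channel. This is an illustration under assumptions, not the authors' pipeline. The DFT codebook, the single-antenna user, the noise-free gain metric, and the helper names `dft_codebook` and `best_beam_index` are all hypothetical choices made here.

```python
import numpy as np

def dft_codebook(num_antennas: int, num_beams: int) -> np.ndarray:
    """Return a (num_antennas, num_beams) DFT codebook (assumed codebook type)."""
    n = np.arange(num_antennas)[:, None]
    k = np.arange(num_beams)[None, :]
    # Each column is one beamforming vector, scaled to unit norm.
    return np.exp(-2j * np.pi * n * k / num_beams) / np.sqrt(num_antennas)

def best_beam_index(h_mmwave: np.ndarray, codebook: np.ndarray) -> int:
    """Class label: index of the codeword maximizing the gain |h^H f|^2."""
    gains = np.abs(h_mmwave.conj() @ codebook) ** 2
    return int(np.argmax(gains))

# Hypothetical example: a channel perfectly aligned with codeword 5.
codebook = dft_codebook(num_antennas=16, num_beams=16)
h = codebook[:, 5] * np.sqrt(16)
label = best_beam_index(h, codebook)
```

In a setup like this, a network would take sub-6 GHz channel features as input and be trained to predict `label`, so beam selection becomes a multi-class classification problem over the codebook size.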

Abstract

Beamforming in millimeter-wave (mmWave) high-mobility environments typically incurs substantial training overhead. While prior studies suggest that sub-6 GHz channels can be exploited to predict optimal mmWave beams, existing methods depend on large deep learning (DL) models with prohibitive computational and memory requirements. In this paper, we propose a computationally efficient framework for sub-6 GHz channel-mmWave beam mapping based on the knowledge distillation (KD) technique. We develop two compact student DL architectures based on individual and relational distillation strategies, which retain only a few hidden layers yet closely mimic the performance of large teacher DL models. Extensive simulations demonstrate that the proposed student models achieve the teacher's beam prediction accuracy and spectral efficiency while reducing trainable parameters and computational complexity by 99%.
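
The two distillation strategies named in the abstract can be sketched with generic losses from the KD literature. The code below is a minimal NumPy illustration and is not taken from the paper: the individual loss is the usual temperature-softened KL divergence between teacher and student outputs, and the relational loss compares normalized pairwise distances between samples in a batch, using a plain squared error where relational KD formulations often use a Huber loss. The function names, the temperature value, and the epsilon constant are assumptions.

```python
import numpy as np

def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Row-wise softmax with temperature, stabilized by subtracting the max."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def individual_kd_loss(student_logits, teacher_logits, temperature=4.0) -> float:
    """KL(teacher || student) on softened outputs, averaged over the batch."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    eps = 1e-12  # avoids log(0); assumed value
    kl = np.sum(p_t * (np.log(p_t + eps) - np.log(p_s + eps)), axis=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return float(temperature ** 2 * kl.mean())

def pairwise_distances(features: np.ndarray) -> np.ndarray:
    """Batch-by-batch Euclidean distances, normalized by their mean nonzero value."""
    diff = features[:, None, :] - features[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    mean = d[d > 0].mean() if np.any(d > 0) else 1.0
    return d / mean

def relational_kd_loss(student_feats, teacher_feats) -> float:
    """Mean squared mismatch between student and teacher distance structures."""
    gap = pairwise_distances(student_feats) - pairwise_distances(teacher_feats)
    return float(np.mean(gap ** 2))
```

Because the relational loss only compares batch-by-batch distance matrices, the student's feature dimension does not need to match the teacher's, which suits a student with far fewer hidden units. In practice either term would typically be combined with a cross-entropy loss on the true beam labels; the weighting between them is a tuning choice not specified here.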

Paper Structure (10 sections, 10 equations, 4 figures, 1 table, 1 algorithm)

This paper contains 10 sections, 10 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: Schematic of the proposed KD framework.
  • Figure 2: Validation loss of the considered models over training epochs.
  • Figure 3: Beam prediction accuracies of the compared DL models versus SNRs.
  • Figure 4: SE performance achieved by the considered DL models versus SNRs.