Beam Prediction based on Large Language Models

Yucheng Sheng, Kai Huang, Le Liang, Peng Liu, Shi Jin, Geoffrey Ye Li

TL;DR

This work reframes mmWave beam prediction as a time-series forecasting problem and uses embedding-visible large language models (LLMs) to predict future optimal beams without fine-tuning the backbone. Historical beam indices and angles of departure (AoDs) are aggregated through cross-variable attention, and the fused features are converted into text-based prototypes, which aligns the wireless data with the LLM's pre-training. A Patch-Reprogramming mechanism and a Prompt-as-Prefix (PaP) strategy let the LLM reason over time-series inputs using a reduced vocabulary of text prototypes, with a mean-squared error loss guiding multi-step forecasts: $L = \frac{1}{H} \sum_{n=1}^{H} \|\hat{Y}_n - Y_n\|_2^2$. Empirical results on DeepMIMO show that BP-LLM is more robust and generalizes better than LSTM baselines across variations in speed, BS, and center frequency, supporting the potential of LLMs for resilient wireless prediction tasks. The findings suggest that PaP and time-series patching can unlock broader applications of LLMs in wireless communication systems.
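
To make the loss in the summary concrete, here is a minimal Python sketch of the multi-step mean-squared error. It reads $H$ as the number of forecast steps and treats each $\hat{Y}_n$ and $Y_n$ as a plain list of floats; both readings, along with the function name `multi_step_mse`, are assumptions made for illustration and not the authors' implementation.

```python
from typing import Sequence


def multi_step_mse(
    y_hat: Sequence[Sequence[float]],
    y: Sequence[Sequence[float]],
) -> float:
    """Return (1/H) * sum_n ||y_hat[n] - y[n]||_2^2 over H forecast steps.

    Hypothetical helper: each element of y_hat / y is one forecast step,
    represented as a sequence of floats (an assumption for this sketch).
    """
    if len(y_hat) != len(y) or len(y) == 0:
        raise ValueError("y_hat and y must be non-empty and have the same number of steps")

    total = 0.0
    for pred_step, true_step in zip(y_hat, y):
        if len(pred_step) != len(true_step):
            raise ValueError("each predicted step must match the shape of its target")
        # Squared L2 norm of the error at this step.
        total += sum((p - t) ** 2 for p, t in zip(pred_step, true_step))

    # Average over the H forecast steps.
    return total / len(y)
```

For example, with two steps where only the last component is off by 2, the squared errors sum to 4 and the loss is 4 / 2 = 2.0.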

Abstract

In this letter, we use large language models (LLMs) to develop a high-performing and robust beam prediction method. We formulate the millimeter wave (mmWave) beam prediction problem as a time series forecasting task, where the historical observations are aggregated through cross-variable attention and then transformed into text-based representations using a trainable tokenizer. By leveraging the prompt-as-prefix (PaP) technique for contextual enrichment, our method harnesses the power of LLMs to predict future optimal beams. Simulation results demonstrate that our LLM-based approach outperforms traditional learning-based models in prediction accuracy as well as robustness, highlighting the significant potential of LLMs in enhancing wireless communication systems.

Paper Structure (7 sections, 10 equations, 7 figures)


Figures (7)

  • Figure 1: Illustration of LLM-based mmWave prediction. Given the past optimal beam indices and AoDs, we first patch and fuse them using cross-attention. Then after patch reprogramming, they form several discrete tokens together with PaP. The output patches from the LLMs are projected to generate the future optimal beam indices.
  • Figure 2: Prediction performance of the proposed method, BP-LLM, compared with other learning-based methods, averaged across test velocities of 5, 10, 15, 20 m/s.
  • Figure 3: Performance of the proposed method compared with other learning-based methods under the mismatched BS settings.
  • Figure 4: Performance of the proposed method compared with other learning-based methods under the mismatched center frequency.
  • Figure 5: Performance of the proposed system with different antennas.
  • ...and 2 more figures
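
The caption of Figure 1 says the past beam indices and AoDs are first split into patches. A minimal sketch of what patching a history sequence could look like is given below. The sliding-window scheme, the function name `make_patches`, and the `patch_len` and `stride` parameters are assumptions chosen for illustration; the page does not state the patch length or stride the authors use.

```python
from typing import List, Sequence


def make_patches(
    series: Sequence[float],
    patch_len: int,
    stride: int,
) -> List[List[float]]:
    """Split a 1-D history into (possibly overlapping) fixed-length patches.

    Hypothetical helper: patch_len and stride are illustrative parameters,
    not values taken from the paper.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    if len(series) < patch_len:
        raise ValueError("series is shorter than one patch")

    # Slide a window of patch_len samples forward by stride each time;
    # trailing samples that do not fill a whole patch are dropped.
    last_start = len(series) - patch_len
    return [
        list(series[start:start + patch_len])
        for start in range(0, last_start + 1, stride)
    ]
```

With a history of six beam indices, `make_patches([1, 2, 3, 4, 5, 6], patch_len=4, stride=2)` yields two overlapping patches, `[1, 2, 3, 4]` and `[3, 4, 5, 6]`. Each patch would then be the unit that is fused across variables and reprogrammed into token embeddings.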