Transformer-Based Sparse CSI Estimation for Non-Stationary Channels
Muhammad Ahmed Mohsin, Muhammad Umer, Ahsan Bilal, Hassan Rizwan, Sagnik Bhattacharya, Muhammad Ali Jamshed, John M. Cioffi
TL;DR
This work tackles CSI estimation in non-stationary, high-mobility wireless channels by introducing a pilot-aided Flash-Attention Transformer that unifies pilot acquisition with data-driven CSI reconstruction. Through patch-wise self-attention and a physics-aware composite loss, the model learns long-range time–frequency dependencies while conditioning on the pilot mask to adapt to Doppler and multipath dynamics. In standardized 3GPP NR MIMO-OFDM scenarios, it outperforms LMMSE, LSTM, and LDAMP baselines by about $13~\mathrm{dB}$ in phase-invariant NMSE and achieves markedly lower BER while reducing pilot overhead by $16\times$, approaching oracle dense-pilot performance. By enabling high-fidelity CSI reconstruction from sparse pilots, the approach improves spectral efficiency and link reliability in non-stationary 5G and beyond-5G networks.
Abstract
Accurate and efficient estimation of Channel State Information (CSI) is critical for next-generation wireless systems operating under non-stationary conditions, where user mobility, Doppler spread, and multipath dynamics rapidly alter channel statistics. Conventional pilot-aided estimators incur substantial overhead, while deep learning approaches degrade under dynamic pilot patterns and time-varying fading. This paper presents a pilot-aided Flash-Attention Transformer framework that unifies model-driven pilot acquisition with data-driven CSI reconstruction through patch-wise self-attention and a physics-aware composite loss function that enforces phase alignment, correlation consistency, and time–frequency smoothness. Under a standardized 3GPP NR configuration, the proposed framework outperforms LMMSE and LSTM baselines by approximately 13 dB in phase-invariant normalized mean-square error (NMSE) and achieves a markedly lower bit-error rate (BER), while reducing pilot overhead by a factor of 16. These results demonstrate that attention-based architectures enable reliable CSI recovery and enhanced spectral efficiency without compromising link quality, addressing a fundamental bottleneck in adaptive, low-overhead channel estimation for non-stationary 5G and beyond-5G networks.
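The headline metric, phase-invariant NMSE, is not defined in this section. A minimal sketch of one common reading follows, assuming the metric is the ordinary NMSE evaluated after rotating the estimate by the single global phase that best aligns it with the true channel. The function names and the flat list-of-complex-samples layout are illustrative assumptions, not the paper's implementation.

```python
import cmath
import math


def plain_nmse(h_true, h_est):
    """Ordinary NMSE: error energy divided by true-channel energy."""
    err = sum(abs(t - e) ** 2 for t, e in zip(h_true, h_est))
    ref = sum(abs(t) ** 2 for t in h_true)
    return err / ref


def phase_invariant_nmse(h_true, h_est):
    """NMSE after removing the best single global phase rotation (assumed definition)."""
    # The angle of the inner product <h_est, h_true> is the rotation of h_est
    # that minimizes the squared error against h_true.
    inner = sum(e.conjugate() * t for t, e in zip(h_true, h_est))
    rotation = cmath.exp(1j * cmath.phase(inner))
    return plain_nmse(h_true, [rotation * e for e in h_est])


def to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)


# Hypothetical channel samples; the "estimate" is the true channel with a
# common 0.7 rad phase offset, which carries no amplitude or shape error.
h = [1 + 2j, -0.5 + 0.3j, 0.2 - 1.1j]
h_rotated = [x * cmath.exp(0.7j) for x in h]

print(plain_nmse(h, h_rotated))            # penalizes the common phase offset
print(phase_invariant_nmse(h, h_rotated))  # close to zero: offset is removed
```

Under this reading, an estimate that differs from the true channel only by a common phase offset scores near zero, whereas plain NMSE penalizes the offset. For scale, a 13 dB gap corresponds to a linear error-power ratio of $10^{1.3} \approx 20$.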
