Leveraging Self-Supervised Learning for MIMO-OFDM Channel Representation and Generation
Zongxi Liu, Jiacheng Chen, Yunting Xu, Ting Ma, Jingbo Liu, Haibo Zhou, Dusit Niyato
TL;DR
This work tackles the overhead and variability challenges in MIMO-OFDM channels by learning a self-supervised, geometry-aware representation suitable for geolocation-based MIMO transmission. It introduces a Transformer-based encoder to extract latent channel representations $F^{b,u,t}$ from unlabeled data, a conditional diffusion generator to sample new representations conditioned on geolocation, and a Transformer-encoder-based decoder to reconstruct channels for downstream precoding tasks. The approach is validated on the public ray-tracing DeepMIMO dataset, showing that the learned representations faithfully capture channel structure and that diffusion-based generation outperforms baselines in geolocation-based MIMO transmission. The framework enables CSI-free, location-aware precoding in FD-RAN architectures, with practical potential for 6G transmission and sensing applications.
Abstract
In communications theory, the capacity of multiple-input multiple-output orthogonal frequency-division multiplexing (MIMO-OFDM) systems is fundamentally determined by wireless channels, which exhibit both diversity and correlation in the spatial, frequency, and temporal domains. Exploiting the inherent nature of channels, namely their representation, is further envisioned as a way to achieve geolocation-based MIMO transmission for 6G, exemplified by the fully-decoupled radio access network (FD-RAN). Accordingly, this paper first employs self-supervised learning to obtain channel representations from unlabeled channel data, then proposes a channel-generation-assisted approach for determining the MIMO precoding matrix based solely on geolocation. Specifically, we exploit the small-scale temporal-domain variations of channels at a fixed geolocation and design a pretext task tailored for contrastive learning. A Transformer-based encoder is then trained to output channel representations. We further develop a conditional diffusion generator to generate channel representations from geolocation. Finally, a Transformer-encoder-based decoder reconstructs channels from the generated representations, and the optimal channel is selected for calculating the precoding matrix for both single- and dual-BS transmission. We conduct experiments on a public ray-tracing channel dataset; extensive simulation results demonstrate the effectiveness of our channel representation method and show the performance improvement in geolocation-based MIMO transmission.
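To make the contrastive pretext idea concrete, the following is a minimal sketch of an InfoNCE-style loss. The abstract does not specify the exact loss or pairing rule, so this is an assumption for illustration only: two channel snapshots taken at the same geolocation at different times are treated as a positive pair, and snapshots from other geolocations in the batch act as negatives. The function name `info_nce_loss`, the `temperature` value, and the use of plain NumPy arrays as embeddings are all hypothetical.

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.1):
    """InfoNCE loss for a batch of paired embeddings (illustrative only).

    Assumption: z_a[i] and z_b[i] embed two channel snapshots from the same
    geolocation at different times (a positive pair); every z_b[j] with
    j != i serves as a negative for z_a[i].
    """
    # L2-normalise so that the dot product is a cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    logits = z_a @ z_b.T / temperature           # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability

    # Row-wise log-softmax; the positive pair sits on the diagonal.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

In a sketch like this, the loss is small when embeddings of same-location snapshots are close to each other and far from those of other locations, which is the behaviour a contrastive encoder is trained toward.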
