Learning-Based User Association for MmWave Vehicular Networks With Kernelized Contextual Bandits
Xiaoyang He, Xiaoxia Huang
TL;DR
The paper tackles user association in mmWave vehicular networks under rapidly varying channels by framing the problem as a multi-agent contextual bandit in which rewards are nonlinear functions of context-derived features. It introduces DK-UCB, which maps contexts into an RKHS via a novel mmWave-aware kernel that multiplies four similarity components (blockage, path loss, Doppler, interference) to estimate rewards without extra channel measurements. A synchronization mechanism triggers information exchange only after a vehicle has explored substantially, balancing learning speed against communication overhead. Theoretical regret and communication-cost bounds are provided, and numerical results on realistic vehicular scenarios show improved learning efficiency and higher average rates than the baselines, especially at higher vehicle densities. Overall, the approach offers a practical, scalable solution for dynamic, location-aware user association in mmWave vehicular networks.
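The product structure of the kernel described above can be illustrated with a minimal sketch. It is not the paper's kernel: the four-scalar context layout, the Gaussian form of each similarity term, and the names `rbf`, `product_kernel`, and `gram_matrix` are assumptions made only for illustration. The sketch shows that multiplying per-component similarities yields a valid (symmetric, positive semidefinite) kernel.

```python
import numpy as np

# Hypothetical context layout (an assumption, not taken from the paper):
# each context is a length-4 vector holding one scalar feature per component.
COMPONENTS = ("blockage", "path_loss", "doppler", "interference")

def rbf(a, b, length_scale):
    """Gaussian similarity between two scalar features (placeholder form)."""
    return np.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))

def product_kernel(x, y, length_scales=(1.0, 1.0, 1.0, 1.0)):
    """Multiply one similarity term per component of the context."""
    k = 1.0
    for i, ls in enumerate(length_scales):
        k *= rbf(x[i], y[i], ls)
    return float(k)

def gram_matrix(contexts, length_scales=(1.0, 1.0, 1.0, 1.0)):
    """Pairwise kernel values for a list or array of contexts."""
    n = len(contexts)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = product_kernel(contexts[i], contexts[j], length_scales)
    return K
```

Because each factor is itself a valid kernel, their product is also a valid kernel, so the resulting Gram matrix can be used directly in a kernelized estimator.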
Abstract
Vehicles need timely channel information to decide which base station (BS) to communicate with, but frequently estimating fast-fading mmWave channels is costly. Without additional channel estimation, the proposed Distributed Kernelized Upper Confidence Bound (DK-UCB) algorithm estimates the current instantaneous transmission rates from past contexts, such as the vehicle's location and velocity, together with past instantaneous transmission rates. To capture the nonlinear mapping from a context to the instantaneous transmission rate, DK-UCB maps each context into a reproducing kernel Hilbert space (RKHS), in which the mapping becomes linear. To improve estimation accuracy, we propose a novel kernel function that incorporates the propagation characteristics of mmWave signals. Moreover, DK-UCB encourages a vehicle to share the necessary information once it has conducted significant exploration, which speeds up learning while keeping communication costs affordable.
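The idea of estimating a rate from past contexts and rates, and then adding an exploration bonus, can be sketched with a generic single-agent kernelized UCB rule. This is not DK-UCB itself: the plain RBF kernel stands in for the paper's mmWave-aware kernel, the regularizer `lam` and exploration weight `beta` are hypothetical parameters, and the distributed synchronization step is omitted.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Placeholder RBF kernel between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * length_scale ** 2))

def kernel_ucb(past_contexts, past_rates, query, lam=1.0, beta=1.0):
    """Return (mean, width, ucb) for one candidate BS at the query context."""
    X = np.atleast_2d(past_contexts)
    y = np.asarray(past_rates, dtype=float)
    q = np.atleast_2d(query)
    K = rbf_kernel(X, X)                    # similarities among past contexts
    k_q = rbf_kernel(X, q)[:, 0]            # similarities to the query context
    A = K + lam * np.eye(len(X))            # regularized Gram matrix
    mean = float(k_q @ np.linalg.solve(A, y))   # kernel ridge estimate of the rate
    var = float(rbf_kernel(q, q)[0, 0] - k_q @ np.linalg.solve(A, k_q))
    width = float(np.sqrt(max(var, 0.0)))   # uncertainty; larger far from past data
    return mean, width, mean + beta * width

def choose_bs(history, query, lam=1.0, beta=1.0):
    """history maps a BS id to (contexts, rates); pick the BS with the largest UCB."""
    scores = {bs: kernel_ucb(X, y, query, lam, beta)[2]
              for bs, (X, y) in history.items()}
    return max(scores, key=scores.get)
```

In this sketch a vehicle would call `choose_bs` with its current context, observe the achieved rate, and append the new (context, rate) pair to that BS's history; no extra channel estimation is involved.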
