HoloBeam: Learning Optimal Beamforming in Far-Field Holographic Metasurface Transceivers
Debamita Ghosh, Manjesh Kumar Hanawal, Nikola Zlatanova
TL;DR
HoloBeam tackles the challenge of learning optimal far-field beamforming for holographic metasurface transceivers (HMTs) under a fixed pilot budget. By discretizing two continuous phase-shifting parameters and exploiting the unimodal dependence of received signal strength (RSS) on each parameter, it frames beamforming as a fixed-budget pure-exploration bandit problem and introduces a two-phase algorithm that learns one parameter at a time. The authors derive an upper bound on the misidentification probability that decays exponentially with the number of pilots, and show through simulations that HoloBeam outperforms state-of-the-art pure-exploration methods in line-of-sight (LOS) HMT scenarios, with notable gains in throughput. The work suggests significant practical impact for low-cost, scalable mmWave/THz beamforming, and points to future extensions to multimodal channels with non-line-of-sight (NLOS) components.
Abstract
Holographic Metasurface Transceivers (HMTs) are emerging as cost-effective substitutes for large antenna arrays for beamforming in millimeter-wave and terahertz communication. However, achieving the desired channel gains through beamforming in an HMT requires appropriately setting the phase-shifts of a large number of elements, which is challenging. Moreover, the optimal phase-shifts depend on the location of the receivers, which may be unknown. In this work, we develop a learning algorithm using a fixed-budget multi-armed bandit framework to beamform and maximize received signal strength at the receiver in far-field regions. Our algorithm, named HoloBeam, exploits the parametric form of the channel gains of the beams, which can be expressed in terms of two phase-shifting parameters. Even after parameterization, the problem remains challenging because the phase-shifting parameters take continuous values. To overcome this, HoloBeam works with discrete values of the phase-shifting parameters and exploits their unimodal relation with the channel gains to learn the optimal values faster. We upper bound the probability of HoloBeam incorrectly identifying the (discrete) optimal phase-shift parameters in terms of the number of pilots used in learning, and show that this probability decays exponentially with the number of pilot signals. We demonstrate through extensive simulations that HoloBeam outperforms state-of-the-art algorithms.
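
To make the idea concrete, the following is a minimal Python sketch of a two-phase, fixed-budget search over two discretized parameters. It is not the HoloBeam algorithm from the paper: it swaps in a simple noisy ternary search as the per-parameter routine, and the RSS model is a hypothetical separable Gaussian bump rather than the HMT channel model. All names (`make_toy_rss`, `unimodal_search`, `two_phase_search`) and all numeric settings are assumptions made for illustration.

```python
import numpy as np


def make_toy_rss(peak1, peak2, noise_std, seed=0):
    """Hypothetical RSS oracle: a separable unimodal gain plus Gaussian noise.

    This is a toy stand-in, NOT the far-field HMT channel model.
    """
    rng = np.random.default_rng(seed)

    def rss(i, j):
        gain = np.exp(-((i - peak1) / 6.0) ** 2) * np.exp(-((j - peak2) / 6.0) ** 2)
        return gain + noise_std * rng.normal()

    return rss


def num_rounds(n_arms):
    """Number of ternary-search rounds needed to shrink to at most 3 arms."""
    d, rounds = n_arms - 1, 0
    while d >= 3:
        d -= d // 3 + 1  # each round removes k + 1 arms from one side
        rounds += 1
    return rounds


def unimodal_search(pull, n_arms, budget):
    """Noisy ternary search over arms 0..n_arms-1 using at most `budget` pulls.

    Assumes the mean reward is unimodal in the arm index.
    """
    stage_budget = budget // (num_rounds(n_arms) + 1)
    if stage_budget < 3:
        raise ValueError("budget too small for this many arms")
    lo, hi = 0, n_arms - 1
    while hi - lo >= 3:
        k = (hi - lo) // 3
        m1, m2 = lo + k, hi - k
        n = stage_budget // 2
        mean1 = np.mean([pull(m1) for _ in range(n)])
        mean2 = np.mean([pull(m2) for _ in range(n)])
        if mean1 < mean2:
            lo = m1 + 1  # under unimodality the peak lies to the right of m1
        else:
            hi = m2 - 1  # under unimodality the peak lies to the left of m2
    # Final stage: split the last stage budget evenly over the remaining arms.
    arms = list(range(lo, hi + 1))
    n = stage_budget // len(arms)
    means = [np.mean([pull(a) for _ in range(n)]) for a in arms]
    return arms[int(np.argmax(means))]


def two_phase_search(rss, n1, n2, budget, init2=0):
    """Learn one parameter at a time, giving half the pilot budget to each phase."""
    half = budget // 2
    best1 = unimodal_search(lambda i: rss(i, init2), n1, half)
    best2 = unimodal_search(lambda j: rss(best1, j), n2, budget - half)
    return best1, best2


if __name__ == "__main__":
    rss = make_toy_rss(peak1=20, peak2=7, noise_std=0.01)
    print(two_phase_search(rss, n1=32, n2=32, budget=2800))
```

In this sketch the budget is split evenly across phases and across search stages, which is only one possible allocation. Note that the first phase observes gains scaled by the (possibly poor) initial value of the second parameter, so its effective signal-to-noise ratio depends on `init2`.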
