Reinforcement Learning for Rate Maximization in IRS-aided OWC Networks
Ahrar N. Hamad, Ahmad Adnan Qidan, Taisir E. H. Elgorashi, Jaafar M. H. Elmirghani
TL;DR
This work addresses line-of-sight blockages in indoor IRS-aided visible-light communication (VLC) by jointly allocating APs and IRS mirror elements to users. It casts the problem as a Markov decision process and applies model-free reinforcement learning (Q-learning and SARSA) to find AP-mirror associations in real time and without prior system knowledge, with performance close to a MILP benchmark. The results show sum-rate gains of up to 66% over a system without IRS and up to 45% over distance-based IRS mirror assignment, and they show that IRSs remain effective under blockage with both single and multiple mirror arrays. The study indicates that RL is practical for dynamic, blockage-resilient IRS-aided OWC systems, with implications for real-time control of future indoor wireless networks.
Abstract
Optical wireless communication (OWC) is envisioned as one of the main enabling technologies of 6G networks, complementing radio frequency (RF) systems to provide high data rates. A crucial issue in indoor OWC is service interruption caused by blockages that obstruct the line of sight (LoS) between users and their access points (APs). Recently, intelligent reflecting surfaces (IRSs) have been considered as a way to improve connectivity in OWC systems by reflecting AP signals toward users. In this study, we investigate the integration of IRSs into an indoor OWC system to improve the sum rate of the users and to ensure service continuity. We formulate an optimization problem for sum rate maximization, in which the allocation of both APs and IRS mirror elements to users is determined so as to enhance the aggregate data rate. Moreover, reinforcement learning (RL) algorithms, specifically Q-learning and SARSA, are proposed to provide real-time solutions with low complexity and without prior system knowledge. The results show that the RL algorithms achieve near-optimal solutions close to those of mixed integer linear programming (MILP). The results also show that the proposed scheme achieves up to a 45% increase in data rate compared to a traditional scheme that optimizes only the allocation of APs while assigning mirror elements to users based on distance.
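The two RL algorithms named in the abstract differ only in how they bootstrap the next-state value. The sketch below shows the standard tabular Q-learning and SARSA update rules on a toy problem; it is not the authors' implementation. The single-state setup, the three candidate AP-mirror associations, and their rates are hypothetical values chosen for illustration, as are the function names and hyperparameters.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return q_row.index(max(q_row))


def q_learning_update(q, s, a, r, s_next, alpha, gamma):
    """Off-policy update: bootstrap from the best action in the next state."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy update: bootstrap from the action actually taken next."""
    target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])


def train(rates, rule, steps=2000, alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Learn action values for a toy single-state problem.

    `rates` is a hypothetical list of sum rates, one per candidate
    AP-mirror association (action). The reward is the rate of the chosen
    action. A real system would have many states and a rate model.
    """
    rng = random.Random(seed)
    q = [[0.0] * len(rates)]  # Q-table: one state (row), one column per action
    s = 0
    a = epsilon_greedy(q[s], epsilon, rng)
    for _ in range(steps):
        r = rates[a]
        s_next = 0  # toy assumption: the environment has a single state
        # The next action is drawn before the update. SARSA needs it for its
        # target; Q-learning ignores it in the target and only executes it.
        a_next = epsilon_greedy(q[s_next], epsilon, rng)
        if rule == "sarsa":
            sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma)
        else:
            q_learning_update(q, s, a, r, s_next, alpha, gamma)
        s, a = s_next, a_next
    return q[0]


if __name__ == "__main__":
    hypothetical_rates = [1.0, 3.0, 2.0]  # illustrative values only
    for rule in ("q_learning", "sarsa"):
        values = train(hypothetical_rates, rule)
        print(rule, "prefers association", values.index(max(values)))
```

Both rules need only observed rewards, which is what "without prior system knowledge" refers to: no channel or blockage model enters the update. Q-learning evaluates the greedy policy regardless of how actions are explored, while SARSA evaluates the exploring policy itself, so its values also reflect the cost of exploratory choices.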
