Learning Optimal Scheduling Policy for Remote State Estimation under Uncertain Channel Condition
Shuang Wu, Xiaoqiang Ren, Qing-Shan Jia, Karl Henrik Johansson, Ling Shi
TL;DR
The paper studies optimal sensor scheduling for remote state estimation when the channel dropout rate $r_s$ is unknown. It shows that the $Q$-factor is monotone and submodular, which yields threshold-like and randomized-threshold optimal policies under costly and constrained communication, respectively. To handle unknown channels, it develops two complementary learning frameworks: (i) stochastic approximation-based Q-learning with structural enhancements and (ii) parameter learning that estimates $r_s$ and plugs it into analytic policy formulas, with convergence guarantees for both. Numerical experiments demonstrate faster convergence with structured Q-learning, adaptability to time-varying channels, and favorable trade-offs compared to direct parameter-based control. Together, these structure-exploiting methods address remote state estimation over uncertain channels and could be extended to more complex channel models or multi-sensor setups.
Abstract
We consider optimal sensor scheduling with unknown communication channel statistics. We formulate two types of scheduling problems, with the communication rate being a soft or a hard constraint, respectively. We first present structural results on the optimal scheduling policy using dynamic programming, assuming the channel statistics are known. We prove that the Q-factor is monotonic and submodular, which leads to threshold-like structures in both types of problems. We then develop stochastic approximation and parameter learning frameworks to deal with the two scheduling problems when the channel statistics are unknown. We exploit the structure of each problem to design specialized learning algorithms, and we prove the convergence of these algorithms. Numerical examples show improved performance compared with the standard Q-learning algorithm.
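As a rough illustration of the threshold structure and of the parameter learning idea, the following sketch may help. It does not reproduce the paper's model or algorithms. It assumes a simplified setting: the state is the holding time tau (steps since the last successful reception, truncated at `n_max`), the estimation error cost is linear in tau, the criterion is discounted, and each transmission incurs a fixed cost. The function names `solve_schedule` and `estimate_drop_prob` and all parameter values are hypothetical.

```python
import random


def solve_schedule(drop_prob, comm_cost=20.0, n_max=50, gamma=0.95, tol=1e-9):
    """Value iteration over the holding time tau (assumed toy model).

    Returns (policy, value), where policy[tau] is 1 if transmitting
    has the smaller Q-factor at holding time tau and 0 otherwise.
    """
    states = range(n_max + 1)
    cost = [float(t) for t in states]          # assumed error cost, increasing in tau
    nxt = [min(t + 1, n_max) for t in states]  # tau grows unless a packet arrives
    v = [0.0] * (n_max + 1)
    while True:
        # Q-factor of staying idle: pay the error cost, tau increases.
        q_idle = [cost[t] + gamma * v[nxt[t]] for t in states]
        # Q-factor of sending: extra communication cost, tau resets on success.
        q_send = [
            cost[t] + comm_cost
            + gamma * ((1.0 - drop_prob) * v[0] + drop_prob * v[nxt[t]])
            for t in states
        ]
        v_new = [min(a, b) for a, b in zip(q_idle, q_send)]
        if max(abs(a - b) for a, b in zip(v_new, v)) < tol:
            policy = [int(s < i) for s, i in zip(q_send, q_idle)]
            return policy, v_new
        v = v_new


def estimate_drop_prob(acks):
    """Running-mean estimate of the dropout probability from ACK flags.

    The update est += (sample - est) / k is a stochastic approximation
    step with step size 1/k; it equals the sample mean of the dropouts.
    """
    est = 0.0
    for k, ack in enumerate(acks, start=1):
        est += ((0.0 if ack else 1.0) - est) / k
    return est


if __name__ == "__main__":
    rng = random.Random(0)
    true_drop = 0.3  # hypothetical channel, unknown to the scheduler
    acks = [rng.random() > true_drop for _ in range(20000)]
    drop_hat = estimate_drop_prob(acks)
    policy, _ = solve_schedule(drop_hat)  # plug the estimate into the planner
    print("estimated dropout:", round(drop_hat, 3))
    print("transmit once tau reaches:", policy.index(1))
```

In this toy model the value function is nondecreasing in tau, so the computed policy switches from idle to transmit exactly once, which is the threshold behaviour described above. Plugging the estimate into the planner is a certainty-equivalent shortcut chosen for illustration and stands in for the analytic policy formulas, which are not given here.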
