Timely Best Arm Identification in Restless Shared Networks

Mengqiu Zhou, Vincent Y. F. Tan, Meng Zhang

Abstract

Real-time status updating applications increasingly rely on networks of devices and edge nodes to maintain data freshness, as quantified by the age of information (AoI) metric. Given that edge computing nodes exhibit uncertain and time-varying dynamics, it is essential to identify the optimal edge node with high confidence and sample efficiency, even without prior knowledge of these dynamics, to ensure timely updates. To address this challenge, we introduce the first best arm identification (BAI) problem aimed at minimizing the long-term average AoI under a fixed confidence setting, framed within the context of a restless multi-armed bandit (RMAB) model. In this model, each arm evolves independently according to an unknown Markov chain over time, regardless of whether it is selected. To capture the temporal trajectories of AoI in the presence of unknown restless dynamics, we develop an age-aware LUCB algorithm that incorporates Markovian sampling. Additionally, we establish an instance-dependent upper bound on the sample complexity, which captures the difficulty of the problem as a function of the underlying Markov mixing behavior. Moreover, we derive an information-theoretic lower bound to characterize the fundamental challenges of the problem. We show that the sample complexity is influenced by the temporal correlation of the Markov dynamics, aligning with the intuition offered by the upper bound. Our numerical results show that, compared to existing benchmarks, the proposed scheme significantly reduces sampling costs, particularly under more stringent confidence levels.
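To make the objective concrete, the sketch below computes the time-average AoI of a single node under a zero-wait policy (a new update is sent the moment the previous one is delivered), given a sequence of observed delivery delays. It is an illustration only, not the paper's algorithm: the function name `average_aoi_zero_wait` and the example delay values are hypothetical, and the average is taken over the window between the first and last delivery.

```python
from typing import Sequence


def average_aoi_zero_wait(delays: Sequence[float]) -> float:
    """Time-average AoI between the first and last delivery (zero-wait policy).

    Assumption: update i is generated the instant update i-1 is delivered,
    so the age just after a delivery equals that update's own delay.
    """
    if len(delays) < 2:
        raise ValueError("need at least two delays")
    area = 0.0      # integral of the age curve
    elapsed = 0.0   # length of the observation window
    for prev, cur in zip(delays, delays[1:]):
        # Age starts at `prev` and grows linearly for `cur` time units:
        # a rectangle (prev * cur) plus a triangle (cur^2 / 2).
        area += prev * cur + 0.5 * cur * cur
        elapsed += cur
    return area / elapsed


# Hypothetical delay traces for two edge nodes (arms).
fast_node = [1.0, 1.0, 1.0]
slow_node = [1.0, 2.0]
best = min(("fast", fast_node), ("slow", slow_node),
           key=lambda item: average_aoi_zero_wait(item[1]))[0]
```

In the setting described above the delays are driven by an unknown Markov chain, so these averages would have to be estimated from samples with confidence bounds rather than computed from a known trace.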

Paper Structure (29 sections, 8 theorems, 62 equations, 4 figures, 2 algorithms)

Key Result

Lemma 1

Let $P$ be an irreducible transition probability matrix on the finite state space $\mathcal{S}$ satisfying Assumption assump:P. Then, for any $\theta_a$, the matrix $P_{\theta_a}$ is irreducible and positive recurrent.

Figures (4)

  • Figure 1: The system model. Each edge node is modeled as an arm in a restless multi-armed bandit framework, with unknown congestion dynamics.
  • Figure 2: Age of information $\Delta(t)$ evolution in time under zero-wait policy.
  • Figure 3: The structure of the Markov regeneration sampling strategy.
  • Figure 4: Numerical evaluations of the performance comparison with (a) the confidence level $\delta$, (b) the Markov mixing behavior, and (c) the instances.

Theorems & Definitions (10)

  • Lemma 1
  • Lemma 2
  • Definition 1: Pseudo Spectral Gap
  • Proposition 1
  • Lemma 3
  • Corollary 1
  • Theorem 1: Sample Complexity
  • Theorem 2: Lower Bound
  • Lemma 4
  • Remark 1