Beyond Freshness and Semantics: A Coupon-Collector Framework for Effective Status Updates

Youssef Ahmed, Arnob Ghosh, Chih-Chun Wang, Ness B. Shroff

Abstract

For status update systems operating over unreliable energy-constrained wireless channels, we address Weaver's long-standing Level-C question: do my packets actually improve the plant's behavior? Each fresh sample carries a stochastic expiration time -- governed by the plant's instability dynamics -- after which the information becomes useless for control. Casting the problem as a coupon-collector variant with expiring coupons, we (i) formulate a two-dimensional average-reward MDP, (ii) prove that the optimal schedule is doubly thresholded in the receiver's freshness timer and the sender's stored lifetime, (iii) derive a closed-form policy for deterministic lifetimes, and (iv) design a Structure-Aware Q-learning algorithm (SAQ) that learns the optimal policy without knowing the channel success probability or lifetime distribution. Simulations validate our theoretical predictions: SAQ matches optimal Value Iteration performance while converging significantly faster than baseline Q-learning, and expiration-aware scheduling achieves up to 50% higher reward than age-based baselines by adapting transmissions to state-dependent urgency -- thereby delivering Level-C effectiveness under tight resource constraints.

Paper Structure

This paper contains 30 sections, 10 theorems, 46 equations, 5 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

Under Assumptions as:predictable-tau--as:feasibility, if the transmission cost satisfies $\beta c \leq 1$ and $\beta c < \bar{T}$, where $\bar{T} \triangleq \lim_{K \to \infty} \frac{1}{K} \sum_{k=1}^{K} T^{\mathrm{MSE}}_{m_k}$ is the limiting average expiration time along the JIT transmission epoch

Figures (5)

  • Figure 1: Impact of observation age and noise $\sigma_{\max}$ on controller performance.
  • Figure 2: Expiration Analysis and Scheduling Performance.
  • Figure 3: Coupon-collector model with expiring samples.
  • Figure 4: Optimal policy regions for varying $p_s$ values ($K=20$, $c/r=0.5$). Blue (shaded) regions: the sender transmits ($a=1$); white regions: the sender remains silent ($a=0$). The diagonal $T_r = T_s$ marks the boundary above which the receiver already holds fresher data, so transmission is never optimal (Theorem \ref{thm:global_structure} (ii)). As $p_s$ decreases from left to right, the send region expands to compensate for higher packet-loss probability.
  • Figure 5: Average reward vs. system parameters for Baseline Q, SAQ, and optimal VI.
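
The send/silent regions described in the Figure 4 caption can be illustrated with a minimal sketch of a doubly thresholded decision rule. This is not the paper's derived policy. The threshold names `tr_max` and `ts_min`, their values, the direction of each threshold (send when the receiver's timer is low and the sender's stored lifetime is high), and the choice to stay silent exactly on the diagonal are all assumptions made for illustration. Only the action labels ($a=1$ send, $a=0$ silent), the grid size $K=20$, and the rule that transmission is never optimal above the diagonal $T_r = T_s$ come from the caption.

```python
from typing import List


def threshold_policy(t_r: int, t_s: int, tr_max: int, ts_min: int) -> int:
    """Return 1 (transmit) or 0 (stay silent) for state (t_r, t_s).

    t_r: receiver's freshness timer; t_s: sender's stored lifetime.
    tr_max and ts_min are hypothetical thresholds, not values from the paper.
    """
    # From the Figure 4 caption: above the diagonal the receiver already
    # holds fresher data, so never transmit. Treating the diagonal itself
    # (t_r == t_s) as silent is an assumption.
    if t_r >= t_s:
        return 0
    # Assumed threshold directions: send only when the receiver is close to
    # expiry and the stored sample still has enough lifetime left.
    return 1 if (t_r <= tr_max and t_s >= ts_min) else 0


def policy_grid(K: int, tr_max: int, ts_min: int) -> List[List[int]]:
    """Tabulate the rule on {0..K} x {0..K}; rows index t_r, columns t_s."""
    return [
        [threshold_policy(t_r, t_s, tr_max, ts_min) for t_s in range(K + 1)]
        for t_r in range(K + 1)
    ]


if __name__ == "__main__":
    # K = 20 matches the caption; the two thresholds are arbitrary.
    grid = policy_grid(K=20, tr_max=5, ts_min=3)
    for row in grid:
        print("".join("#" if a else "." for a in row))
```

Printing the grid gives a rough text analogue of one panel: `#` marks send states and `.` marks silent states. Under the assumed threshold directions, raising `tr_max` or lowering `ts_min` enlarges the send region, which mimics the expansion the caption reports as $p_s$ decreases.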

Theorems & Definitions (24)

  • Definition 1: MSE-based expiration time
  • Theorem 1: Optimality of JIT policy
  • proof
  • Remark 1: Interpretation of the conditions
  • Remark 2
  • Remark 3: On the tolerance parameters
  • Lemma 1
  • proof
  • Theorem 2
  • proof
  • ...and 14 more