Semi-Markov Decision Process Framework for Age of Incorrect Information Minimization

Ismail Cosandal, Sennur Ulukus, Nail Akar

TL;DR

The paper minimizes a weighted sum of AoII-based penalties and transmission costs in a remote estimation setting with a DTMC source and a forward channel whose delay follows a general discrete-time phase-type (DPH) distribution. It casts the problem as a semi-Markov decision process (SMDP) under an estimation-dependent multi-threshold policy, and develops the dual-regime absorbing Markov chain (DR-AMC) and its absorption-time distribution, the dual-regime DPH (DR-DPH), to compute the SMDP parameters. Policy iteration yields the optimal multi-threshold policy, and numerical results show substantial gains over single-threshold and random policies, with the optimal policy converging to always-transmit as the transmission cost weight vanishes. The approach gives a scalable, analyzable method for semantic-aware freshness optimization under complex channel models.

Abstract

For a remote estimation system, we study age of incorrect information (AoII), a recently proposed semantic-aware freshness metric. In particular, we assume an information source that observes a discrete-time finite-state Markov chain (DTMC) and employs push-based transmissions of status update packets to a monitor tasked with remote estimation of the source. The source-to-monitor channel delay is assumed to have a general discrete-time phase-type (DPH) distribution, whereas the zero-delay reverse channel ensures that the source has perfect information on AoII and the remote estimate. A multi-threshold transmission policy is employed, in which a packet transmission is initiated when the AoII process exceeds a threshold that may be different for each estimation value. In this general setting, our goal is to minimize, by suitable choice of the thresholds, the weighted sum of the time average of an arbitrary function of AoII and estimation, and the transmission costs. We formulate the problem as a semi-Markov decision process (SMDP) with the same state space as the original DTMC to obtain the optimum multi-threshold policy. The parameters of the SMDP are obtained by using a novel stochastic tool called the dual-regime absorbing Markov chain (DR-AMC) and its corresponding absorption time distribution, named dual-regime DPH (DR-DPH).
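
To make the DPH channel-delay assumption concrete, here is a minimal sketch of the standard probability mass function of a discrete-time phase-type distribution, P(D = k) = γ G^(k-1) g with exit vector g = (I - G)1, for k ≥ 1. The function name `dph_pmf` and the two example channels are illustrative, not taken from the paper, and the sketch assumes the initial vector γ sums to one (no probability mass at zero delay), which may differ from the paper's convention.

```python
def dph_pmf(gamma, G, k):
    """P(D = k), k >= 1, for D ~ DPH(gamma, G).

    gamma: initial phase probabilities (assumed to sum to 1).
    G:     sub-stochastic transition matrix among the transient phases.
    """
    n = len(gamma)
    # Exit vector g = (I - G) 1: probability of absorption from each phase in one step.
    g = [1.0 - sum(G[i]) for i in range(n)]
    # Row vector gamma * G^(k-1), built by repeated vector-matrix products.
    v = list(gamma)
    for _ in range(k - 1):
        v = [sum(v[i] * G[i][j] for i in range(n)) for j in range(n)]
    return sum(v[i] * g[i] for i in range(n))


# Hypothetical one-phase channel: geometric delay with success probability 0.5.
geometric_p3 = dph_pmf([1.0], [[0.5]], 3)

# Hypothetical two-phase channel: deterministic delay of exactly 2 slots.
deterministic_p2 = dph_pmf([1.0, 0.0], [[0.0, 1.0], [0.0, 0.0]], 2)
```

The one-phase case reduces to a geometric delay, and chaining phases deterministically produces a fixed delay, which is why the DPH family can represent a wide range of discrete delay distributions.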

Paper Structure

This paper contains 13 sections, 2 theorems, 12 equations, 3 figures, 2 tables.

Key Result

Lemma 1

The $m$th factorial moment of $T$, namely $\nu_m=\mathbb{E}[T^{\underline{m}}]$ with $T^{\underline{m}}=T(T-1)(T-2)\cdots(T-m+1)$ [graham1994concrete], is given in closed form in eq:pow. Additionally, the ordinary moments of $T$, denoted by $\mu_T(m)=\mathbb{E}[T^{m}]$, can be obtained from the factorial moments as $\mu_T(m)=\sum_{k=0}^{m} S(m,k)\,\nu_k$, where $S(m,k)$ is the Stirling number of the second kind.
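
The conversion from factorial to ordinary moments relies on the standard identity $x^m=\sum_{k=0}^{m} S(m,k)\,x^{\underline{k}}$, so that $\mathbb{E}[T^m]=\sum_{k=0}^{m} S(m,k)\,\nu_k$. The sketch below applies it in Python. The helper names are illustrative, and the check uses a geometric random variable as a stand-in, since the closed form for the DR-DPH factorial moments is not reproduced here.

```python
from fractions import Fraction
from math import factorial


def stirling2(m, k):
    """Stirling number of the second kind via S(m,k) = k*S(m-1,k) + S(m-1,k-1)."""
    if m == k:
        return 1
    if k == 0 or k > m:
        return 0
    return k * stirling2(m - 1, k) + stirling2(m - 1, k - 1)


def ordinary_moment(m, nu):
    """E[T^m] from factorial moments nu[k] = E[T(T-1)...(T-k+1)], with nu[0] = 1."""
    return sum(stirling2(m, k) * nu[k] for k in range(m + 1))


def geometric_factorial_moment(k, p):
    """Stand-in example only: factorial moments of a geometric variable on {1, 2, ...}."""
    if k == 0:
        return Fraction(1)
    return factorial(k) * (1 - p) ** (k - 1) / p ** k


p = Fraction(1, 2)
nu = [geometric_factorial_moment(k, p) for k in range(4)]
second_moment = ordinary_moment(2, nu)  # E[T^2]
third_moment = ordinary_moment(3, nu)   # E[T^3]
```

Exact rational arithmetic via `Fraction` keeps the check free of floating-point error; for large $m$ a tabulated Stirling triangle would be preferable to the plain recursion.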

Figures (3)

  • Figure 1: The remote estimation system involving the source process $X_t$, the monitor process $\hat{X}_t$, and the forward channel modeled by DPH($\bm{\gamma},\bm{G}$). The monitor updates its estimation with the received updates (marked with dashed circles).
  • Figure 2: A sample path for the AoII cost throughout two complete cycles, when $\tau_2=2$, $\tau_3=1$, and AoII penalty functions $f_2(x)=x^2$, $f_3(x)=x$ are used. $P_t$ amounts to the phase of the channel process in case of a transmission at time $t$. Otherwise, the channel is idle.
  • Figure 3: Comparison of benchmark policies with the proposed SMDP policy for varying $\lambda$ in (a) scenario I and (b) scenario II.
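
The parameters quoted in the Figure 2 caption can be written down as a small sketch of an estimation-dependent threshold rule. This is one reading of the setup, not the paper's implementation: it assumes that the subscripts of $\tau$ and $f$ index the current estimate, and that "exceeds a threshold" means AoII at or above the threshold (the paper may use a strict inequality). The names `should_transmit` and `aoii_penalty` are hypothetical.

```python
# Values from the Figure 2 caption, keyed by the estimate value (assumed indexing).
thresholds = {2: 2, 3: 1}                        # tau_2 = 2, tau_3 = 1
penalties = {2: lambda x: x ** 2, 3: lambda x: x}  # f_2(x) = x^2, f_3(x) = x


def should_transmit(aoii, estimate):
    """Multi-threshold rule: start a transmission once AoII reaches the
    threshold for the current estimate (assumption: non-strict inequality)."""
    return aoii >= thresholds[estimate]


def aoii_penalty(aoii, estimate):
    """Per-slot AoII penalty under the estimate-specific penalty function."""
    return penalties[estimate](aoii)


# With AoII = 1, the rule triggers for estimate 3 but not yet for estimate 2.
decisions = {est: should_transmit(1, est) for est in thresholds}
```

Keeping one threshold per estimate value is what distinguishes this rule from the single-threshold benchmark, which would use the same cutoff for every estimate.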

Theorems & Definitions (3)

  • Lemma 1
  • Definition 1: Embedded Points
  • Lemma 2: Polynomial AoII Penalty Functions