Goal-oriented Estimation of Multiple Markov Sources in Resource-constrained Systems

Jiping Luo, Nikolaos Pappas

TL;DR

We address remote estimation of $M$ Markov sources in resource-constrained networks with a one-slot transmission delay, using the cost of actuation error (CAE) as the key performance metric. The problem is formulated as an average-cost CMDP and tackled via two transformations: a Lagrangian relaxation, which yields a $\lambda$-optimal policy, and a Lyapunov drift approach, which converts the constraint into a virtual-queue stability problem and leads to a drift-plus-penalty objective. Two policies are developed: a low-complexity DPP policy for known statistics and a model-free LO-DRL policy based on PPO for unknown environments. Both use the one-slot expected CAE to decide sampling actions. In simulations, the policies substantially reduce the CAE and suppress uninformative transmissions in both single-source and multi-source setups, with LO-DRL performing best under uncertain or poor channel conditions. The approach offers a practical, semantics-aware framework for resource-limited networked control systems, with broad applicability to remote actuation tasks.
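For orientation, the two transformations can be written in the generic form used in standard Lyapunov optimization. This is a sketch under stated assumptions, not the paper's own equations: the per-slot resource cost $c(a_t)$, the budget $F_{\max}$, the virtual queue $Z_t$, and the trade-off weight $V$ are illustrative symbols; only $\bar{\Delta}_t$ (the one-slot expected CAE) appears in the text here.

```latex
% Constrained average-cost problem (generic CMDP form; symbols are assumptions)
\min_{\pi}\ \limsup_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}^{\pi}\!\left[\Delta_t\right]
\quad \text{s.t.} \quad
\limsup_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}^{\pi}\!\left[c(a_t)\right] \le F_{\max}

% Lagrangian relaxation with multiplier \lambda \ge 0
\mathcal{L}(\pi,\lambda) = \limsup_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1}
\mathbb{E}^{\pi}\!\left[\Delta_t + \lambda\, c(a_t)\right] - \lambda F_{\max}

% Virtual queue: stability of Z_t implies the average constraint is met
Z_{t+1} = \max\{ Z_t + c(a_t) - F_{\max},\ 0 \}

% Drift-plus-penalty rule: greedy per-slot minimization
a_t \in \arg\min_{a}\ V\,\bar{\Delta}_t(a) + Z_t\, c(a)
```

A larger $V$ weights the actuation cost more heavily and lets the virtual queue grow larger before transmissions are throttled.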

Abstract

This paper investigates goal-oriented communication for remote estimation of multiple Markov sources in resource-constrained networks. An agent decides the updating times of the sources and transmits the packet to a remote destination over an unreliable channel with delay. The destination is tasked with source reconstruction for actuation. We utilize the metric cost of actuation error (CAE) to capture the state-dependent actuation costs. We aim for a sampling policy that minimizes the long-term average CAE subject to an average resource constraint. We formulate this problem as an average-cost constrained Markov Decision Process (CMDP) and relax it into an unconstrained problem by utilizing Lyapunov drift techniques. Then, we propose a low-complexity drift-plus-penalty (DPP) policy for systems with known source/channel statistics and a Lyapunov optimization-based deep reinforcement learning (LO-DRL) policy for unknown environments. Our policies significantly reduce the number of uninformative transmissions by exploiting the timing of the important information.
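To make the drift-plus-penalty idea concrete, the following is a minimal sketch for a single two-state source. It is not the authors' algorithm: the transition matrix `P_EXAMPLE`, the cost matrix `DELTA_EXAMPLE`, the unit cost per transmission attempt, and all function names are hypothetical choices for illustration. The sampler observes the current state, compares the expected next-slot actuation cost with and without a transmission attempt, and transmits only when the weighted gain exceeds the virtual-queue backlog.

```python
import random

# Hypothetical two-state example; these numbers are not taken from the paper.
P_EXAMPLE = [[0.9, 0.1], [0.2, 0.8]]      # P[x][x_next]: source transition probabilities
DELTA_EXAMPLE = [[0.0, 5.0], [1.0, 0.0]]  # DELTA[true_state][estimate]: actuation cost


def update_queue(queue, action, budget):
    """Virtual queue: grows when a transmission (cost 1) exceeds the per-slot budget."""
    return max(queue + action - budget, 0.0)


def expected_cost(P, delta, x, x_hat):
    """Expected next-slot cost if the receiver holds estimate x_hat and the source is in x."""
    return sum(P[x][nxt] * delta[nxt][x_hat] for nxt in range(len(P)))


def dpp_action(P, delta, x, x_hat, p_s, queue, V):
    """Return 1 (transmit) if V * expected cost reduction exceeds the queue backlog."""
    idle = expected_cost(P, delta, x, x_hat)  # estimate stays at x_hat
    fresh = expected_cost(P, delta, x, x)     # estimate becomes x after a one-slot delay
    gain = p_s * (idle - fresh)               # expected reduction from one attempt
    return 1 if V * gain > queue else 0


def simulate(T=20000, budget=0.3, p_s=0.8, V=10.0, seed=0):
    """Run the sketch and return (average cost, transmission rate, final queue)."""
    rng = random.Random(seed)
    P, delta = P_EXAMPLE, DELTA_EXAMPLE
    x, x_hat, queue = 0, 0, 0.0
    total_cost, total_tx = 0.0, 0
    for _ in range(T):
        a = dpp_action(P, delta, x, x_hat, p_s, queue, V)
        delivered = a == 1 and rng.random() < p_s
        x_next = 0 if rng.random() < P[x][0] else 1
        if delivered:
            x_hat = x  # the receiver learns the state sampled one slot earlier
        total_cost += delta[x_next][x_hat]
        total_tx += a
        queue = update_queue(queue, a, budget)
        x = x_next
    return total_cost / T, total_tx / T, queue
```

Calling `simulate()` returns the average cost, the transmission rate, and the final queue length. Because the queue starts at zero and each update adds at least `action - budget`, the transmission rate on any sample path is at most `budget + queue / T`, which is how queue stability enforces the average resource constraint. Note that with an asymmetric cost matrix the rule can stay silent even when the estimate is wrong, because a fresh sample does not always lower the expected cost.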

Paper Structure

This paper contains 16 sections, 2 theorems, 23 equations, 5 figures, and 1 algorithm.

Key Result

Lemma 1

The one-slot expected CAE $\bar{\Delta}_t$ is given by

Figures (5)

  • Figure 1: Remote state estimation of multiple Markov sources.
  • Figure 2: The average CAE vs. $p_s$ for the different policies.
  • Figure 3: The transmission frequency vs. $p_s$ for the different policies.
  • Figure 4: Performance comparison of the DPP and the LO-DRL policies.
  • Figure 5: Average CAE vs. the number of sources for the different policies.

Theorems & Definitions (8)

  • Remark 1
  • Remark 2
  • Remark 3
  • Lemma 1
  • proof
  • Theorem 1
  • proof
  • Remark 4