An MDP Model for Censoring in Harvesting Sensors: Optimal and Approximated Solutions

Jesus Fernandez-Bes; Jesus Cid-Sueiro; Antonio G. Marques

TL;DR

This paper studies censoring policies for energy-harvesting wireless sensors. It formulates the problem as an infinite-horizon Markov Decision Process and proves that, under certain conditions on the battery model, the optimal policy is a battery-dependent threshold on message importance. It introduces model-based stochastic approximations (SAP) that exploit this threshold structure, converge faster and with lower computational complexity than Q-learning, and remain robust to non-stationarities. The authors derive analytical characterizations of the optimal policy, discuss steady-state battery distributions, and present practical algorithms, including an adaptive balanced transmitter. Numerical experiments in single-hop and multi-hop networks show that energy-dependent censoring policies can significantly outperform balanced and non-selective approaches, especially when harvested energy is scarce. The work provides scalable, implementable strategies for energy-aware censoring in dynamic wireless sensor networks, with potential applications in green communications and distributed sensing.

Abstract

In this paper, we propose a novel censoring policy for energy-efficient transmissions in energy-harvesting sensors. The problem is formulated as an infinite-horizon Markov Decision Process (MDP). The objective to be optimized is the expected sum of the importance (utility) of all transmitted messages. Assuming that such importance can be evaluated at the transmitting node, we show that, under certain conditions on the battery model, the optimal censoring policy is a threshold function on the importance value. Specifically, messages are transmitted only if their importance is above a threshold whose value depends on the battery level. Exploiting this property, we propose a model-based stochastic scheme that approximates the optimal solution, with less computational complexity and faster convergence speed than a conventional Q-learning algorithm. Numerical experiments in single-hop and multi-hop networks confirm the analytical advantages of the proposed scheme.
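
The threshold structure described above can be illustrated with a small simulation. This is only a sketch under assumed conditions, not the authors' algorithm: the exponential importance distribution, the Bernoulli harvesting model, the fixed transmission cost, and the hand-picked linear threshold table (`linear_thresholds`) are all hypothetical choices made for illustration. The optimal thresholds derived in the paper are not reproduced here.

```python
import random


def linear_thresholds(capacity, high, low):
    """Hypothetical threshold table: decreases linearly from `high`
    (empty battery) to `low` (full battery). Not the optimal policy."""
    step = (high - low) / capacity
    return [high - step * level for level in range(capacity + 1)]


def decide(importance, battery, thresholds, tx_cost):
    """Transmit only if there is enough energy and the message importance
    exceeds the threshold for the current battery level."""
    return battery >= tx_cost and importance > thresholds[battery]


def simulate(thresholds, steps=20000, capacity=10, tx_cost=2,
             harvest_prob=0.3, seed=0):
    """Return the total importance of transmitted messages.

    Assumptions (illustrative only): importance ~ Exp(1), one energy
    unit harvested per slot with probability `harvest_prob`, and a
    fixed cost of `tx_cost` units per transmission.
    """
    rng = random.Random(seed)
    battery = capacity
    total = 0.0
    for _ in range(steps):
        importance = rng.expovariate(1.0)
        if decide(importance, battery, thresholds, tx_cost):
            total += importance
            battery -= tx_cost
        if rng.random() < harvest_prob:
            battery = min(capacity, battery + 1)
    return total


capacity = 10
selective = linear_thresholds(capacity, high=2.0, low=0.5)
non_selective = [0.0] * (capacity + 1)  # transmit whenever energy allows

print("battery-dependent threshold:", simulate(selective))
print("non-selective:", simulate(non_selective))
```

When harvested energy is scarce relative to the transmission cost, as in this toy configuration, one would expect the selective policy to accumulate more total importance, since it spends its limited energy on the more important messages.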
