Enhancing Information Freshness: An AoI Optimized Markov Decision Process Dedicated In the Underwater Task

Jingzehua Xu; Yimian Ding; Yiyuan Yang; Guanwen Xie; Shuai Zhang

TL;DR

This study presents an age of information (AoI) optimized Markov decision process (AoI-MDP) to improve the performance of underwater tasks. It introduces wait time into the action space and integrates AoI into the reward function, so that information freshness and decision-making are jointly optimized for AUVs trained with RL.

Abstract

Ocean exploration using autonomous underwater vehicles (AUVs) trained via reinforcement learning (RL) has emerged as a significant research focus. However, underwater tasks have mostly failed because of the observation delay caused by acoustic communication in the Internet of Underwater Things. In this study, we present an age of information (AoI) optimized Markov decision process (AoI-MDP) to improve the performance of underwater tasks. Specifically, AoI-MDP models observation delay as signal delay through statistical signal processing and includes this delay as a new component of the state space. Additionally, we introduce wait time into the action space and integrate AoI into the reward function, so that information freshness and decision-making are jointly optimized for AUVs trained with RL. Finally, we apply this approach to a multi-AUV data collection task as an example. Simulation results demonstrate the feasibility of AoI-MDP, which effectively minimizes AoI while achieving superior task performance. To accelerate related research in this field, we have made the simulation code available as open source.
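To make the role of wait time concrete, the following minimal sketch computes the time-average AoI under the commonly used sawtooth model, in which the age drops to the observation delay when an observation arrives and then grows linearly until the next arrival. It assumes that the next observation is requested only after the current one has arrived and the chosen wait time has elapsed; this convention, the function name `average_aoi`, and its arguments are illustrative assumptions, not taken from the paper's released code.

```python
from typing import Sequence


def average_aoi(delays: Sequence[float], waits: Sequence[float]) -> float:
    """Time-average AoI of a sawtooth age process (illustrative sketch).

    Assumed convention (not confirmed by the paper):
      - delays[i] is the observation delay Y_i of the i-th observation,
      - waits[i] is the wait time Z_i chosen after the i-th arrival,
      - the (i+1)-th observation is requested once Z_i has elapsed.
    """
    if len(delays) != len(waits) + 1:
        raise ValueError("need exactly one more delay than wait times")

    total_area = 0.0  # integral of the age curve over all intervals
    total_time = 0.0  # total elapsed time between first and last arrival

    for i, wait in enumerate(waits):
        # Time between the i-th arrival and the (i+1)-th arrival.
        length = wait + delays[i + 1]
        # The age starts at Y_i and grows with slope 1, so each interval
        # contributes a trapezoid: Y_i * length + length^2 / 2.
        total_area += delays[i] * length + 0.5 * length ** 2
        total_time += length

    if total_time == 0:
        raise ValueError("total elapsed time is zero; average is undefined")
    return total_area / total_time


if __name__ == "__main__":
    # Constant unit delay with zero wait between requests.
    print(average_aoi([1.0, 1.0, 1.0], [0.0, 0.0]))
```

Under this assumed model the average AoI depends on both the delays and the chosen wait times, which is one way to see why the wait time can be treated as a decision variable and the AoI as a reward term.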

Paper Structure (8 sections, 8 equations, 5 figures, 1 table, 1 algorithm)

Introduction
Methodology
AoI Optimized Markov Decision Process
Observation Delay and Information Modeling
Experiments
Task Description and Settings
Experiment Results and Analysis
Conclusion

Figures (5)

Figure 1: Illustration of the AoI model, which is defined using a sawtooth piecewise function, where $Y_i$ and $Z_i$ denote the observation delay and wait time at time $i$, respectively.
Figure 2: Illustration of the azimuth and time delay estimation.
Figure 3: Comparison of experimental results of RL training based on AoI-MDP and standard MDP.
Figure 4: Comparison of experimental results using online and offline RL algorithms based on AoI-MDP.
Figure 5: The AUV trajectories using the expert policy trained via the SAC algorithm based on AoI-MDP and standard MDP.
