Tiny Multi-Agent DRL for Twins Migration in UAV Metaverses: A Multi-Leader Multi-Follower Stackelberg Game Approach

Jiawen Kang; Yue Zhong; Minrui Xu; Jiangtian Nie; Jinbo Wen; Hongyang Du; Dongdong Ye; Xumin Huang; Dusit Niyato; Shengli Xie

TL;DR

This paper proposes a tiny machine learning-based Stackelberg game framework, built on pruning techniques, for efficient UAV Twin (UT) migration in UAV metaverses, and designs a Tiny Multi-Agent Deep Reinforcement Learning (Tiny MADRL) algorithm to obtain the tiny networks that represent the optimal game solution.

Abstract

The synergy between Unmanned Aerial Vehicles (UAVs) and metaverses is giving rise to an emerging paradigm named UAV metaverses, which create a unified ecosystem that blends physical and virtual spaces, transforming drone interaction and virtual exploration. UAV Twins (UTs), the digital twins of UAVs that make UAV applications more immersive, realistic, and informative, are deployed and updated on ground base stations, e.g., RoadSide Units (RSUs), to offer metaverse services to UAV Metaverse Users (UMUs). Due to the dynamic mobility of UAVs and the limited communication coverage of RSUs, it is essential to perform real-time UT migration to ensure seamless immersive experiences for UMUs. However, selecting appropriate RSUs and optimizing the required bandwidth for reliable and efficient UT migration is challenging. To address these challenges, we propose a tiny machine learning-based Stackelberg game framework that uses pruning techniques for efficient UT migration in UAV metaverses. Specifically, we formulate a multi-leader multi-follower Stackelberg model that incorporates a new immersion metric of UMUs into the utilities of UAVs. Then, we design a Tiny Multi-Agent Deep Reinforcement Learning (Tiny MADRL) algorithm to obtain the tiny networks representing the optimal game solution. In this algorithm, the actor-critic network leverages pruning techniques to reduce the number of network parameters, shrinking model size and computation and allowing for efficient implementation of Tiny MADRL. Numerical results demonstrate that our proposed schemes have better performance than traditional schemes.
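The pruning idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a generic magnitude-based structured pruning rule (rank each hidden neuron by the L1 norm of its incoming weights and keep the top fraction). This is not the paper's dynamic structured pruning algorithm, and the names `prune_neurons`, `drop_inputs`, and `keep_ratio` are hypothetical.

```python
def prune_neurons(weights, biases, keep_ratio):
    """Structured pruning of one dense layer (illustrative assumption).

    weights[i] holds the incoming weights of neuron i. Neurons are ranked by
    the L1 norm of their incoming weights and only the top fraction is kept.
    """
    n = len(weights)
    n_keep = max(1, int(n * keep_ratio))  # always keep at least one neuron
    scores = [sum(abs(w) for w in row) for row in weights]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:n_keep])  # preserve the original neuron order
    return [weights[i] for i in keep], [biases[i] for i in keep], keep


def drop_inputs(next_weights, keep):
    """Remove the input columns of the next layer that fed pruned neurons."""
    return [[row[i] for i in keep] for row in next_weights]


# Toy actor layer: 4 hidden neurons with 2 inputs each, then 1 output neuron.
hidden_w = [[0.1, -0.1], [2.0, 1.0], [0.0, 0.05], [-1.5, 0.5]]
hidden_b = [0.0, 0.1, 0.2, 0.3]
output_w = [[1.0, 2.0, 3.0, 4.0]]

pruned_w, pruned_b, kept = prune_neurons(hidden_w, hidden_b, keep_ratio=0.5)
pruned_out = drop_inputs(output_w, kept)
```

Removing whole neurons, rather than individual weights, shrinks both the layer itself and the input dimension of the following layer, which is why structured pruning reduces model size and computation together.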

Paper Structure (23 sections, 6 theorems, 51 equations, 6 figures, 2 tables, 2 algorithms)

Introduction
Related Work
Metaverses
UAV Twins
Resource Pricing Optimization in Metaverses
DRL with Pruning Techniques
System Model and Problem Formulation
System Model
Multi-Leader Multi-Follower Stackelberg Game between RSUs and UAVs
UAVs' Bandwidth Demands in Stage II
RSUs' bandwidth selling prices in Stage I
Stackelberg Equilibrium Analysis
UAVs' optimal strategies in Stage II
RSUs' optimal strategies as equilibrium in Stage I
Tiny Multi-agent Deep Reinforcement Learning Algorithm
...and 8 more sections

Key Result

Proposition 1

For any given RSU $j$ with the bandwidth selling price $p_i^j$, the optimization problem of UAV $i$ is convex. The optimal bandwidth demand that UAV $i$ procures from RSU $j$ is $\check{b}_i^j$ if the second constraint in Eq. (V_j) is inactive and $\hat{b}_i^j$ if it is active.
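To make the inactive/active distinction in Proposition 1 concrete, here is a minimal sketch of a follower's best response in a convex problem with a bound constraint. It assumes a hypothetical concave utility $U(b) = \alpha \ln(1 + b) - p\,b$ subject to $0 \le b \le b_{\max}$. The paper's actual UAV utility, which includes the immersion metric, is not reproduced here, and the names `utility`, `best_response`, `alpha`, and `b_max` are illustrative assumptions.

```python
import math


def utility(b, price, alpha):
    """Hypothetical concave follower utility: log-shaped benefit minus payment."""
    return alpha * math.log(1.0 + b) - price * b


def best_response(price, alpha, b_max):
    """Return (demand, upper_bound_active) maximizing utility on [0, b_max]."""
    if price <= 0:
        raise ValueError("price must be positive")
    # Stationary point of the unconstrained problem: alpha / (1 + b) = price.
    b_unconstrained = alpha / price - 1.0
    # By concavity, the constrained optimum is the projection onto [0, b_max].
    b_star = min(max(b_unconstrained, 0.0), b_max)
    # The upper bound is active when the unconstrained optimum exceeds it.
    active = b_unconstrained > b_max
    return b_star, active


# Moderate price: the bound is inactive and the interior solution applies.
interior = best_response(price=2.0, alpha=10.0, b_max=10.0)
# Low price: the follower would like more than b_max, so the bound is active.
capped = best_response(price=0.5, alpha=10.0, b_max=10.0)
```

In this toy setting the interior solution plays the role of the inactive-constraint case and the capped solution plays the role of the active-constraint case. A leader (an RSU) would anticipate this response when choosing its price.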

Figures (6)

Figure 1: A tiny learning-based game framework built on pruning techniques. Note that UAVs seamlessly migrate their UTs from pre-migration RSUs to post-migration RSUs, ensuring that UMUs can consistently access and benefit from the metaverse services offered by the UTs.
Figure 2: The dynamic structured pruning algorithm for the Tiny MADRL network. Inactive neurons are represented by dashed circles, while deleted weights are denoted by dotted lines.
Figure 3: Convergence of the Tiny MADRL algorithm.
Figure 4: The strategies of RSUs and UAVs obtained by the Tiny MADRL algorithm.
Figure 5: Relation between the average reward of RSUs and the parameters.
...and 1 more figure

Theorems & Definitions (11)

Definition 1
Proposition 1
proof
Lemma 1
Theorem 1
proof
Theorem 2
proof
Lemma 2
Theorem 3
...and 1 more
