Collaborative Intelligence for UAV-Satellite Network Slicing: Towards a Joint QoS-Energy-Fairness MADRL Optimization

Thanh-Dao Nguyen, Ngoc-Tan Nguyen, Thai-Duong Nguyen, Nguyen Van Huynh, Dinh-Hieu Tran, Symeon Chatzinotas

TL;DR

This work tackles resource management for UAV-satellite networks by proposing a hierarchical network-slicing framework that jointly optimizes UAV trajectories, transmit power, and spectrum allocation across eMBB, URLLC, and mMTC services. It models the problem as a Dec-POMDP and solves it with a MADDPG approach featuring a shared critic with multi-head attention, enabling cooperative learning under partial observability. The method is evaluated in a stand-alone NTN setup with LEO backhaul, showing up to 33% improvement in cumulative reward, up to 8x energy savings, and ~16% fairness gains over baselines. The results demonstrate the practical potential of cooperative multi-agent learning for efficient, fair, and QoS-aware UAV-satellite network slicing with hierarchical resource management.

Abstract

Non-terrestrial networks are critical for achieving global 6G coverage, yet efficient resource management in aerial and space environments remains challenging due to limited onboard power and dynamic operational conditions. Network slicing offers a promising solution for spectrum optimization in UAV-based systems serving heterogeneous service demands. To this end, this paper proposes a hierarchical network slicing framework for UAV-satellite integrated networks supporting eMBB, URLLC, and mMTC services. Specifically, we formulate the joint optimization of UAV trajectory, transmission power, and spectrum allocation as a decentralized partially observable Markov decision process that ensures quality of service while minimizing energy consumption and maximizing resource fairness. To address the computational intractability and partial observability, we develop a multi-agent deep reinforcement learning solution under the centralized training and decentralized execution paradigm. In the proposed system, UAV agents act as distributed actors coordinated by a shared critic that operates with a multi-head attention mechanism at a low Earth orbit satellite. Experimental results demonstrate that our approach outperforms existing methods by up to 33% in cumulative reward while achieving superior energy efficiency and fairness.
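The abstract describes an objective that balances quality of service, energy consumption, and resource fairness. The sketch below illustrates one way such a per-step reward could be assembled. It is not the paper's reward definition, which does not appear in this summary. Jain's index is assumed here as the fairness measure, and the function names, the weights `w_qos`, `w_energy`, and `w_fair`, and the use of a QoS satisfaction ratio are all hypothetical.

```python
from typing import Sequence


def jain_fairness(allocations: Sequence[float]) -> float:
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), ranging over [1/n, 1].

    Assumption: the paper's fairness term is not specified in this summary;
    Jain's index is used purely as a common illustrative choice.
    """
    n = len(allocations)
    sum_sq = sum(x * x for x in allocations)
    if n == 0 or sum_sq == 0.0:
        return 0.0  # no users or nothing allocated: treat as zero fairness
    total = sum(allocations)
    return (total * total) / (n * sum_sq)


def composite_reward(
    qos_satisfied: Sequence[bool],
    energy_joules: float,
    allocations: Sequence[float],
    w_qos: float = 1.0,      # hypothetical weight
    w_energy: float = 0.01,  # hypothetical weight
    w_fair: float = 1.0,     # hypothetical weight
) -> float:
    """Weighted sum of a QoS term, an energy penalty, and a fairness term."""
    # Fraction of users whose QoS requirement is met in this step.
    qos_term = sum(qos_satisfied) / len(qos_satisfied) if qos_satisfied else 0.0
    fairness_term = jain_fairness(allocations)
    # Energy enters with a negative sign, i.e. as a penalty.
    return w_qos * qos_term - w_energy * energy_joules + w_fair * fairness_term


if __name__ == "__main__":
    # Three of four users satisfied, 10 J consumed, equal spectrum shares.
    print(composite_reward([True, True, False, True], 10.0, [2.0, 2.0, 2.0]))
```

Under these assumptions an even allocation scores a fairness of 1, while giving all spectrum to one of n users scores 1/n, so the fairness term rewards balanced slicing and the energy term discourages unnecessary movement and transmit power.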

Paper Structure

This paper contains 29 sections, 11 equations, 4 figures, 1 algorithm.

Figures (4)

  • Figure 1: System model of standalone non-terrestrial UAV-Satellite network.
  • Figure 2: Two-level spectrum allocation framework.
  • Figure 3: Evaluation of the MADDPG framework: (a) Training reward in 1M steps and (b) Reward comparison throughout 2000 simulation steps.
  • Figure 4: Breakdown of the total reward: (a) QoS reward, (b) Energy Penalty, and (c) Fairness reward.