Graph Attention Reinforcement Learning for Multicast Routing and Age-Optimal Scheduling

Yanning Zhang; Guocheng Liao; Shengbin Cao; Ning Yang; Nikolaos Pappas; Meng Zhang

TL;DR

This work minimizes the Age of Information (AoI) in dynamic multicast networks by jointly optimizing multicast routing and scheduling. It introduces a cross-layer, hierarchical reinforcement learning framework that decomposes the problem into a scheduling subproblem and a tree-generating subproblem, built on Normalized Graph Attention (NGAT) embeddings and a Tree Generator-based Multicast Scheduling (TGMS) algorithm. NGAT has a proven contraction mapping property, which supports stable learning, while TGMS solves the Steiner Tree-like routing problem efficiently and generalizes to unseen topologies. Experiments show up to $9.85\times$ computational speedup over traditional multicast routing algorithms, and reductions of 25.6% in average weighted AoI and 29.2% in weighted peak age under low-energy scenarios, indicating practical potential for real-time, energy-aware multicast in SDN-enabled networks.

Abstract

Multicast routing is essential for real-time group applications, such as video streaming, virtual reality, and metaverse platforms, where the Age of Information (AoI) is a crucial metric of information timeliness. This paper studies dynamic multicast networks with the objective of minimizing the expected average AoI by jointly optimizing multicast routing and scheduling. The main challenges stem from the intricate coupling between routing and scheduling decisions, the inherent complexity of multicast operations, and the graph representation of the network. We first decompose the original problem into two subtasks amenable to hierarchical reinforcement learning (RL) methods. We propose the first RL framework to address the multicast routing problem, also known as the Steiner Tree problem, by incorporating graph embedding and the successive addition of nodes and links. For graph embedding, we propose the Normalized Graph Attention (NGAT) mechanism with a proven contraction mapping property, enabling effective graph information capture and superior generalization within the hierarchical RL framework. We validate our framework through experiments on four datasets, including the real-world AS-733 dataset. The results show that our scheme can be up to 9.85 times more computationally efficient than traditional multicast routing algorithms while achieving approximation ratios of 1.1-1.3, comparable to state-of-the-art (SOTA) methods. It also generalizes well, performing effectively on unseen and more complex tasks. Additionally, our age-optimal Tree Generator-based Multicast Scheduling (TGMS) algorithm reduces the average weighted AoI by 25.6% and the weighted peak age by 29.2% under low-energy scenarios.
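
To make the AoI metric concrete, the following minimal Python sketch traces the age of a single destination in slotted time. It assumes that the age grows by one per slot and, on reception, is set to the age of the received packet (delivery slot minus generation slot), with packets arriving in generation order. The function names and the `deliveries` input format are illustrative assumptions, not the paper's notation, and the weighting across destinations used in the paper is omitted.

```python
def aoi_trace(deliveries, horizon, initial_age=0):
    """Return the age of one destination at each of `horizon` slots.

    `deliveries` maps a delivery slot t'_k to the generation slot t_k of the
    packet received in that slot (hypothetical input format).
    """
    age = initial_age
    trace = []
    for t in range(horizon):
        if t in deliveries:
            # On reception, the age becomes the age of the packet itself.
            age = t - deliveries[t]
        else:
            # Without an update, the information gets one slot older.
            age += 1
        trace.append(age)
    return trace


def average_aoi(trace):
    """Time-average age over the traced horizon."""
    return sum(trace) / len(trace)


# Two packets: generated at slot 0 and delivered at slot 2,
# generated at slot 4 and delivered at slot 5.
example_deliveries = {2: 0, 5: 4}
example_trace = aoi_trace(example_deliveries, horizon=7)
print(example_trace)               # [1, 2, 2, 3, 4, 1, 2]
print(average_aoi(example_trace))
```

A packet that travels a longer path arrives with a larger age, which is why routing and scheduling both affect this quantity.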

Paper Structure (32 sections, 11 theorems, 74 equations, 10 figures, 5 tables, 3 algorithms)

Introduction
Background and Motivations
Key Challenges and Solution Approach
Related Works
Multicast Routing
Age of Information
System Model
System Overview
Multicast Process
Age of Information
Energy Consumption
Age-minimal Multicast Scheduling and Routing
Problem Reformulation
Problem Decomposition
Scheduling MDP
...and 17 more sections

Key Result

Lemma 1

Any two vertices in a tree can be connected by a unique simple path.
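
Lemma 1 can be checked on a small example. The sketch below enumerates all simple paths between two vertices by depth-first search, first on a tree and then on the same graph with one extra link that closes a cycle. The graph and the helper names `build_adjacency` and `simple_paths` are illustrative assumptions, not taken from the paper.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) links."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def simple_paths(adj, src, dst, visited=()):
    """Enumerate every simple path from `src` to `dst` by depth-first search."""
    visited = visited + (src,)
    if src == dst:
        return [list(visited)]
    paths = []
    for nxt in sorted(adj[src]):
        if nxt not in visited:
            paths.extend(simple_paths(adj, nxt, dst, visited))
    return paths


tree_edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
tree_paths = simple_paths(build_adjacency(tree_edges), 2, 4)
print(tree_paths)  # [[2, 1, 3, 4]]

# Adding one link creates a cycle, so the path is no longer unique.
cyclic_paths = simple_paths(build_adjacency(tree_edges + [(2, 4)]), 2, 4)
print(len(cyclic_paths))  # 2
```

For a multicast tree, this means each destination is reached from the source along exactly one route.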

Figures (10)

Figure 1: An example of a multicast network. The nodes are connected by links with different costs. At the beginning of each time slot, the source generates update packets, which routers then forward to a subset of the destinations. Packets traveling along different paths may not arrive at the destinations simultaneously, so the AoI differs across destinations.
Figure 2: An example of AoI evolution. The $k$-th packet is generated at time $t_k$ and delivered at time $t'_k$. The AoI of destination $u$ is updated upon reception of a packet, based on the age of that packet at time $t'_k$.
Figure 3: An example of the tree-generating process. The red nodes denote the selected nodes of a partial solution, and the blue nodes denote the selectable nodes. At each step, a node $v$ is selected and added to the partial solution. The destinations are not shown for simplicity.
Figure 4: Temporal relation between two MDPs. $\mathcal{M}_1$ selects a subset of destinations $a_t\subseteq\mathcal{U}_t$ at each time slot $t$, which is utilized by $\mathcal{M}_2(a_t)$ to generate multicast trees. $\tau$ is a virtual timescale that discretizes the generation process of a multicast tree.
Figure 5: The system architecture of TGMS. A scheduler selects a set of destinations, which the tree generator uses to generate a multicast tree. The initial graph embedding is fed into NGAT layers (blue blocks) and iteratively updated. The outputs are passed through a pooling layer (green block) to aggregate global information and then forwarded to two distinct heads (red blocks) for policy and value predictions.
...and 5 more figures
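
The tree-generating process in Figure 3 (grow a partial solution by adding one selectable node at a time) can be sketched in Python as follows. In the paper, the node added at each step is chosen by the learned policy. This sketch substitutes a plain cheapest-link greedy rule, that is, Prim-style growth followed by pruning of unneeded leaves, purely for illustration. The function names, the `links` format, and the example costs are hypothetical.

```python
def grow_tree(links, source, destinations):
    """Grow a tree from `source` until every destination is covered.

    `links` maps an undirected link (u, v) to its cost (hypothetical format).
    """
    adj = {}
    for (u, v), c in links.items():
        adj.setdefault(u, {})[v] = c
        adj.setdefault(v, {})[u] = c

    selected = {source}          # nodes in the partial solution
    edges = []                   # links in the partial solution
    remaining = set(destinations) - selected
    while remaining:
        # Selectable nodes: unselected neighbours of the partial solution.
        frontier = [(c, u, v) for u in selected
                    for v, c in adj[u].items() if v not in selected]
        if not frontier:
            raise ValueError("some destination is unreachable")
        # Stand-in for the learned policy: take the cheapest link.
        _, u, v = min(frontier)
        selected.add(v)
        edges.append((u, v))
        remaining.discard(v)
    return prune_leaves(edges, {source} | set(destinations))


def prune_leaves(edges, keep):
    """Repeatedly drop leaf nodes that are neither source nor destination."""
    edges = list(edges)
    while True:
        degree = {}
        for u, v in edges:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        removable = [(u, v) for u, v in edges
                     if (degree[u] == 1 and u not in keep)
                     or (degree[v] == 1 and v not in keep)]
        if not removable:
            return edges
        edges = [e for e in edges if e not in removable]


links = {("s", "a"): 1, ("a", "d1"): 1, ("s", "b"): 1,
         ("b", "c"): 5, ("a", "d2"): 2}
tree = grow_tree(links, "s", ["d1", "d2"])
tree_cost = sum(links.get((u, v), links.get((v, u))) for u, v in tree)
print(sorted(tree), tree_cost)
```

The greedy rule offers no optimality guarantee for the Steiner Tree problem. It only mirrors the step-by-step construction that the tree generator performs.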

Theorems & Definitions (22)

Remark 1
Remark 2
Lemma 1: west2001introduction
Definition 1: Scheduling Subproblem
Definition 2: Tree-generating Subproblem
Remark 3
Lemma 2
Proposition 1
Proof
Lemma 3
...and 12 more
