TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning

Giuseppe Paolo, Abdelhakim Benechehab, Hamza Cherkaoui, Albert Thomas, Balázs Kégl

TL;DR

TAG introduces a decentralized, arbitrarily deep hierarchical MARL framework built around the LevelEnv abstraction, which treats each hierarchy level as the environment for the level above. Through bidirectional information flow (bottom-up messages and rewards, top-down observational shaping), TAG preserves agent autonomy while supporting coordinated multi-scale control across heterogeneous agents. Empirical results on standard benchmarks (e.g., Simple Spread and Balance) show that deeper hierarchies improve sample efficiency and final performance, with learned inter-level communication (3PPO-comm) offering additional gains in some tasks. The framework's modularity and standardized LevelEnv interfaces support scalable, loosely coupled coordination, suggesting practical impact for complex multi-agent systems and future extensions into dynamic hierarchies and model-based planning.

Abstract

Hierarchical organization is fundamental to biological systems and human societies, yet artificial intelligence systems often rely on monolithic architectures that limit adaptability and scalability. Current hierarchical reinforcement learning (HRL) approaches typically restrict hierarchies to two levels or require centralized training, which limits their practical applicability. We introduce TAME Agent Framework (TAG), a framework for constructing fully decentralized hierarchical multi-agent systems. TAG enables hierarchies of arbitrary depth through a novel LevelEnv concept, which abstracts each hierarchy level as the environment for the agents above it. This approach standardizes information flow between levels while preserving loose coupling, allowing for seamless integration of diverse agent types. We demonstrate the effectiveness of TAG by implementing hierarchical architectures that combine different RL agents across multiple levels, achieving improved performance over classical multi-agent RL baselines on standard benchmarks. Our results show that decentralized hierarchical organization enhances both learning speed and final performance, positioning TAG as a promising direction for scalable multi-agent systems.
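
The LevelEnv idea can be illustrated with a small sketch. This is not TAG's actual API: the class names `BaseEnv`, `Agent`, and `LevelEnv`, the `reset`/`step`/`act`/`message` methods, the scalar toy task, and the fixed (non-learning) policies are all assumptions made for illustration. It also assumes one agent per lower-level agent at every level, which the framework does not require. The point it shows is that a level wraps whatever sits below it and exposes the same interface upward, so levels can be stacked to any depth.

```python
class BaseEnv:
    """Hypothetical toy environment: each agent holds a scalar position
    and is rewarded for being close to zero."""

    def __init__(self, n_agents):
        self.pos = [1.0] * n_agents

    def reset(self):
        self.pos = [1.0] * len(self.pos)
        return list(self.pos)

    def step(self, actions):
        self.pos = [p + a for p, a in zip(self.pos, actions)]
        rewards = [-abs(p) for p in self.pos]
        return list(self.pos), rewards


class Agent:
    """Hypothetical fixed policy; a real agent would learn from rewards."""

    def __init__(self, gain):
        self.gain = gain

    def act(self, obs, goal):
        # The top-down signal (goal) shapes what the agent does with its observation.
        return self.gain * (goal - obs)

    def message(self, obs):
        # Bottom-up message; here simply the raw observation.
        return obs


class LevelEnv:
    """Wraps a lower environment plus the agents of one level, and exposes
    the same reset/step interface to the level above (assumed interface)."""

    def __init__(self, lower, agents):
        self.lower = lower
        self.agents = agents
        self.obs = None

    def reset(self):
        self.obs = self.lower.reset()
        return [a.message(o) for a, o in zip(self.agents, self.obs)]

    def step(self, top_down):
        # Top-down: actions from above condition this level's agents.
        actions = [a.act(o, g) for a, o, g in zip(self.agents, self.obs, top_down)]
        # This level's agents act on the level below as if it were their environment.
        self.obs, rewards = self.lower.step(actions)
        # Bottom-up: messages and rewards are passed to the level above.
        messages = [a.message(o) for a, o in zip(self.agents, self.obs)]
        return messages, rewards


# Stack two levels on top of the toy environment.
base = BaseEnv(n_agents=2)
level1 = LevelEnv(base, [Agent(0.5), Agent(0.5)])
level2 = LevelEnv(level1, [Agent(1.0), Agent(1.0)])

first_messages = level2.reset()
messages, rewards = level2.step([0.0, 0.0])
```

Because `LevelEnv` offers the same `reset` and `step` methods as `BaseEnv`, adding a further level only requires wrapping `level2` in another `LevelEnv`; no level needs to know how many levels sit below it.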

Paper Structure

This paper contains 25 sections, 16 equations, 4 figures, 1 algorithm.

Figures (4)

  • Figure 1: Three- and two-level hierarchical agents used in the four-agent MPE-Spread environment. Yellow boxes represent the hierarchy levels, while blue connections indicate what each agent perceives as its environment. Red connections illustrate how the agents in the real environment are controlled, and green boxes represent the goals that the agents must reach.
  • Figure 2: Representation of the information flows between a level $l$ with two agents and the levels above and below. The top-down flow of actions is shown in blue. The bottom-up flux of messages and rewards is shown in red and green, respectively.
  • Figure 3: Mean average reward in the MPE-Spread environment (a) and Balance environment (b). The mean is calculated over 5 random seeds. Shaded areas represent 95% confidence intervals. The dotted red line in (a) shows the performance of a hand-designed heuristic.
  • Figure 4: Action distributions between top and bottom agents in MAPPO-PPO. (a) The bottom agent receives actions from the top. (b) The bottom agent does not receive actions from the top.