Combining Planning and Reinforcement Learning for Solving Relational Multiagent Domains

Nikhilesh Prabhakar, Ranveer Singh, Harsha Kokel, Sriraam Natarajan, Prasad Tadepalli

TL;DR

MaRePReL tackles the sample inefficiency and non-stationarity of relational multiagent reinforcement learning by integrating a relational hierarchical planner as a centralized controller with task-specific state abstractions and low-level deep RL. The problem is formalized as a goal-directed relational Markov game (GRMG), solved through a planner-driven task distributor and operator-specific RL policies, guided by dynamic D-FOCI abstractions. The work presents the first relational multiagent system that generalizes across different numbers of objects and relations, demonstrates a cohesive architecture combining planning, abstraction, and learning, and shows superior sample efficiency, transfer, and generalization across three relational domains. It also notes practical limitations and points to future directions in scaling, partial observability, and differentiable end-to-end implementations.

Abstract

Multiagent Reinforcement Learning (MARL) poses significant challenges due to the exponential growth of state and action spaces and the non-stationary nature of multiagent environments. This results in notable sample inefficiency and hinders generalization across diverse tasks. The complexity is further pronounced in relational settings, where domain knowledge is crucial but often underutilized by existing MARL algorithms. To overcome these hurdles, we propose integrating relational planners as centralized controllers with efficient state abstractions and reinforcement learning. This approach proves to be sample-efficient and facilitates effective task transfer and generalization.

Paper Structure

This paper contains 31 sections, 2 equations, 5 figures, 6 tables, 3 algorithms.

Figures (5)

  • Figure 1: Our proposed framework with respect to existing literature on relational, hierarchical, and multiagent RL
  • Figure 2: MaRePReL architecture and application in the taxi world for the task of transporting two passengers
  • Figure 3: The success rates across different methodologies for the Taxi, Office World, and Dungeon domains
  • Figure 4: The success rates when transferring a policy from one task to another across different methodologies for the Taxi, Office World, and Dungeon domains
  • Figure 5: Relational Multiagent Domains