Scaling Multi-Agent Epistemic Planning through GNN-Derived Heuristics

Giovanni Briglia, Francesco Fabiano, Stefano Mariani

TL;DR

This work tackles the scalability challenge of Multi-Agent Epistemic Planning (MEP) by learning heuristics from epistemic states (e-states) modeled as Kripke structures. It uses Graph Neural Networks to encode e-state graphs and predict a distance-to-goal estimate, which then guides Best-First Search exploration. The authors compare three Kripke-embedding schemes, develop a data-generation pipeline, and integrate a GNN-based distance estimator into a planning pipeline, showing improved scalability and generalization across standard MEP benchmarks. They also discuss limitations and future directions, including online learning and integration with more advanced search methods such as MCTS. Overall, the approach offers a data-driven way to improve heuristic guidance in complex epistemic planning tasks.
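
To make the search side of this concrete, the following is a minimal sketch of greedy best-first search with a pluggable heuristic. It is not the authors' planner: the names `best_first_search`, `toy_successors`, and `toy_heuristic` are hypothetical, and the integer toy domain only stands in for e-states. In the setting described above, `heuristic` would be replaced by the learned distance estimator.

```python
import heapq
from itertools import count
from typing import Callable, Hashable, Iterable, Optional


def best_first_search(
    start: Hashable,
    is_goal: Callable[[Hashable], bool],
    successors: Callable[[Hashable], Iterable[tuple[str, Hashable]]],
    heuristic: Callable[[Hashable], float],
) -> Optional[list[str]]:
    """Greedy best-first search: always expand the state with the lowest score."""
    tie = count()  # tie-breaker so the heap never has to compare states
    frontier = [(heuristic(start), next(tie), start, [])]
    visited = {start}
    while frontier:
        _, _, state, plan = heapq.heappop(frontier)
        if is_goal(state):
            return plan
        for action, nxt in successors(state):
            if nxt not in visited:
                visited.add(nxt)
                entry = (heuristic(nxt), next(tie), nxt, plan + [action])
                heapq.heappush(frontier, entry)
    return None  # frontier exhausted without reaching a goal


# Toy stand-in for a planning domain (assumption): states are integers, goal is 10.
def toy_successors(n: int) -> list[tuple[str, int]]:
    return [("inc", n + 1), ("double", n * 2)] if n <= 10 else []


def toy_heuristic(n: int) -> float:
    # Placeholder for a learned estimator, e.g. a GNN forward pass on an e-state.
    return abs(10 - n)


plan = best_first_search(1, lambda n: n == 10, toy_successors, toy_heuristic)
print(plan)
```

The search itself is independent of how the score is produced, so a learned estimator can be swapped in without touching the search loop. Its quality only affects how many states are expanded before a goal is found.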

Abstract

Multi-agent Epistemic Planning (MEP) is an autonomous planning framework for reasoning about both the physical world and the beliefs of agents, with applications in domains where information flow and awareness among agents are critical. The richness of MEP requires states to be represented as Kripke structures, i.e., directed labeled graphs. This representation limits the applicability of existing heuristics and hinders the scalability of epistemic solvers, which must explore an exponential search space without guidance, often resulting in intractability. To address this, we exploit Graph Neural Networks (GNNs) to learn patterns and relational structures within epistemic states and use them to guide the planning process. GNNs, which naturally capture the graph-like nature of Kripke models, allow us to derive meaningful estimates of state quality (e.g., the distance from the nearest goal) by generalizing knowledge obtained from previously solved planning instances. We integrate these predictive heuristics into an epistemic planning pipeline and evaluate them against standard baselines, showing improvements in the scalability of multi-agent epistemic planning.
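
As an illustration of why a Kripke structure is a directed labeled graph, here is a small sketch in which worlds are nodes and each edge carries an agent label. The class `PointedKripke` and its methods are hypothetical names for this example, and the flattening in `to_edge_list` is just one generic graph encoding, not one of the embedding schemes the paper compares. The `believes` check follows the standard semantics: an agent believes a fluent if it holds in every world the agent considers possible from the pointed world.

```python
from dataclasses import dataclass, field


@dataclass
class PointedKripke:
    """Hypothetical container for a pointed Kripke structure."""
    worlds: dict[str, frozenset[str]]  # world name -> fluents true there
    pointed: str                       # the designated (actual) world
    edges: set[tuple[str, str, str]] = field(default_factory=set)  # (src, agent, dst)

    def believes(self, agent: str, fluent: str) -> bool:
        """True if `fluent` holds in every world `agent` reaches from the pointed world."""
        reachable = [t for s, a, t in self.edges if s == self.pointed and a == agent]
        return all(fluent in self.worlds[t] for t in reachable)

    def to_edge_list(self):
        """Flatten to integer ids, a common input shape for graph libraries."""
        node_id = {w: i for i, w in enumerate(sorted(self.worlds))}
        agent_id = {a: i for i, a in enumerate(sorted({a for _, a, _ in self.edges}))}
        triples = sorted((node_id[s], agent_id[a], node_id[t]) for s, a, t in self.edges)
        return node_id, agent_id, triples


# Two worlds that differ on fluent "p". Agent "a" cannot tell them apart;
# agent "b" can, so "b" only reaches each world from itself.
state = PointedKripke(
    worlds={"w0": frozenset({"p"}), "w1": frozenset()},
    pointed="w0",
    edges={
        ("w0", "a", "w0"), ("w0", "a", "w1"), ("w1", "a", "w0"), ("w1", "a", "w1"),
        ("w0", "b", "w0"), ("w1", "b", "w1"),
    },
)
node_id, agent_id, triples = state.to_edge_list()
print(state.believes("a", "p"), state.believes("b", "p"))
```

In this toy state, agent `b` believes `p` while agent `a` does not, and that difference is visible only in the edge labels. This is the kind of relational information a graph-based encoder can pick up on.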

Paper Structure

This paper contains 42 sections, 1 equation, 1 figure, 8 tables, 2 algorithms.

Figures (1)

  • Figure 1: Illustration of the overall training and inference process. On the left, we show dataset generation via DFS: teal nodes represent goals (score 0), black dotted arrows show backtracking assigning distances, orange nodes ('x') indicate discarded branches, and gray nodes ('f') are states with no reachable goal. Training is shown by the blue dash-dotted lines: $\langle$e-state, distance$\rangle$ pairs generated by the DFS are fed into the GNN to learn the properties of the e-states. The magenta dashed lines instead illustrate inference, where a single e-state (shown in its expanded graph view for clarity) is input to the GNN to retrieve its estimated distance to the goal. The teal portion of the expanded node represents the goal encoding, while the magenta portion represents the actual e-state.
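
The dataset-generation step in the caption can be sketched as a depth-limited DFS that labels goals with 0 and propagates distances upward while backtracking. This is a rough illustration under stated assumptions, not the paper's pipeline: `label_distances`, the depth cutoff as the reason for discarding a branch, and the use of `None` for states with no reachable goal are all choices made for this example.

```python
from typing import Callable, Hashable, Iterable, Optional


def label_distances(
    start: Hashable,
    is_goal: Callable[[Hashable], bool],
    successors: Callable[[Hashable], Iterable[Hashable]],
    max_depth: int,
) -> dict[Hashable, Optional[int]]:
    """Depth-limited DFS assigning each explored state its distance to the nearest goal found."""
    labels: dict[Hashable, Optional[int]] = {}

    def dfs(state: Hashable, depth: int, on_path: frozenset) -> Optional[int]:
        if is_goal(state):
            labels[state] = 0          # goals get score 0
            return 0
        if depth == max_depth:
            return None                # cutoff: branch discarded, state left unlabeled
        best: Optional[int] = None
        for nxt in successors(state):
            if nxt in on_path:         # skip cycles along the current path
                continue
            d = dfs(nxt, depth + 1, on_path | {nxt})
            if d is not None and (best is None or d + 1 < best):
                best = d + 1           # backtracking: child distance plus one step
        known = labels.get(state)      # keep the smallest label seen so far
        if known is not None and (best is None or known < best):
            best = known
        labels[state] = best           # None means no goal was found below this state
        return best

    dfs(start, 0, frozenset({start}))
    return labels


# Toy graph (assumption): "g" is the only goal; "b" and "c" cannot reach it.
graph = {"s": ["a", "b"], "a": ["g"], "b": ["c"], "c": [], "g": []}
labels = label_distances("s", lambda n: n == "g", lambda n: graph[n], max_depth=5)

# Keep only states with a known distance as <state, distance> training pairs.
pairs = sorted((s, d) for s, d in labels.items() if d is not None)
print(pairs)
```

How the unreachable ('f') states are used in training is not specified in this excerpt, so the sketch simply filters them out when building the pairs.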

Theorems & Definitions (2)

  • definition 1: Pointed Kripke structure
  • definition 2: Multi-agent epistemic planning problem