Computing Universal Plans for Partially Observable Multi-Agent Routing Using Answer Set Programming

Fengming Zhu, Fangzhen Lin

TL;DR

This work addresses universal planning for partially observable multi-agent routing by using Answer Set Programming (ASP) to compute decentralized policies that map local observations to actions. An ASP-based pipeline translates map layouts and agents' fields of view (FoVs) into logic programs and produces policy profiles that guarantee collision-free execution and eventual goal attainment, without centralized real-time coordination. The study analyzes feasibility across map sizes, sensor ranges, and goal configurations, and shows how action heuristics and traffic-rule-like constraints can significantly improve policy quality, including from an anytime optimization perspective. The results suggest practical applicability to large-scale, autonomous systems without central coordination, and point to future extensions in subregion coordination and learning-integrated rule generation.

Abstract

Multi-agent routing problems have gained significant attention recently due to their wide range of industrial applications, ranging from logistics warehouse automation to indoor service robots. Conventionally, they are modeled as classical planning problems. In this paper, we argue that it can be beneficial to formulate them as universal planning problems, particularly when the agents are autonomous entities and may encounter unforeseen situations. We therefore propose universal plans, also known as policies, as the solution concept, and implement a system based on Answer Set Programming (ASP) to compute them. Given an arbitrary two-dimensional map and a profile of goals for a group of partially observable agents, the system translates the problem configuration into logic programs and finds a feasible universal plan for each agent, mapping its observations to actions while ensuring that there are no collisions with other agents. We use the system to conduct experiments and obtain findings regarding the types of goal profiles and environments that lead to feasible policies, as well as how feasibility may depend on the agents' sensors. We also demonstrate how users can customize action preferences to compute more efficient policies, even (near-)optimal ones. The code is available at https://github.com/Fernadoo/MAPF_ASP.
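
To make the phrase "mapping its observations to actions" concrete, here is a minimal Python sketch of one agent's universal plan as a lookup table. It is not taken from the paper or its repository: the action names, the square field of view, the `observe` and `act` helpers, and the default of waiting on unlisted observations are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Cell = Tuple[int, int]                      # (row, column) on the grid map
Observation = Tuple[Cell, FrozenSet[Cell]]  # own cell + cells of agents in view
Policy = Dict[Observation, str]             # observation -> action name

# Hypothetical action set: four unit moves plus waiting in place.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1), "stop": (0, 0)}


def observe(me: Cell, others: Iterable[Cell], radius: int) -> Observation:
    """Return what an agent at `me` perceives with a square field of view."""
    visible = frozenset(
        o for o in others
        if max(abs(o[0] - me[0]), abs(o[1] - me[1])) <= radius
    )
    return (me, visible)


def act(policy: Policy, obs: Observation) -> Cell:
    """Look up the action for `obs` and return the resulting cell."""
    dr, dc = MOVES[policy.get(obs, "stop")]  # assumption: unlisted observations mean waiting
    return (obs[0][0] + dr, obs[0][1] + dc)


# Toy policy for one agent heading right along row 0.
policy: Policy = {
    ((0, 0), frozenset()): "right",          # nobody in view: advance
    ((0, 0), frozenset({(0, 1)})): "stop",   # next cell occupied: wait
}

# The agent at (3, 3) is outside the field of view, so it does not affect the choice.
next_free = act(policy, observe((0, 0), [(3, 3)], radius=1))
next_blocked = act(policy, observe((0, 0), [(0, 1), (3, 3)], radius=1))
print(next_free, next_blocked)
```

The point of the sketch is that the table is indexed by what the agent can sense, not by the full global state, so the same entry is reused across every global state that looks identical from the agent's position.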

Paper Structure (21 sections, 5 theorems, 1 equation, 8 figures, 7 tables)

Key Result

Theorem 1

This ASP implementation (the complete version is attached in the appendix on the problem encoding) is both sound and complete. More specifically: 1) given an answer set of the logic program described above, the policy profile encoded by the atoms of that answer set is feasible; 2) conversely, from any feasible policy profile one can construct an answer set of the logic program.

Figures (8)

  • Figure 1: An example (two chambers connected by a tight hallway) that shows how our proposed policy-like solution works. Once computed (the search took 167.35 s), the policy applies to all $19\times 18\times 17 = 5814$ global states. Colored squares are the goals assigned to the sphere-shaped agents of the corresponding colors. A dotted line between two agents means that they can detect each other.
  • Figure 2: Sum-of-makespan evaluated on different maps with different preference heuristics.
  • Figure 3: Orange cells denote the choices of $g_2$ that lead to feasible policies.
  • Figure 4: Manually designed layouts for the results reported in Table \ref{tab:sensor_man}. From left to right and top to bottom, the layouts are labeled id_1 to id_6.
  • Figure 5: Orange cells denote possible $g_2$'s that lead to feasible policies. (a) Left: the corner case; (b) Right: the boundary case (other than corners).
  • ...and 3 more figures
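
The state count quoted in the Figure 1 caption can be checked with a short calculation. The sketch below assumes, based only on the form of the product $19\times 18\times 17$, that the map has 19 traversable cells and 3 distinguishable agents that must occupy distinct cells; neither number is stated explicitly in the caption.

```python
import math
from itertools import permutations

FREE_CELLS = 19  # assumption: traversable cells, inferred from the product in the caption
AGENTS = 3       # assumption: number of agents, inferred from the three factors

# Ordered placements of distinguishable agents on distinct cells: 19 * 18 * 17.
closed_form = math.perm(FREE_CELLS, AGENTS)

# Cross-check by explicit enumeration of all placements.
enumerated = sum(1 for _ in permutations(range(FREE_CELLS), AGENTS))

print(closed_form, enumerated)
```

Both counts agree with the 5814 global states given in the caption, which is the number of situations a single computed policy profile has to cover.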

Theorems & Definitions (11)

  • Theorem 1: Correctness
  • Definition 1
  • Proposition 1
  • proof
  • Definition 2: Crossroads and Streets
  • Theorem 2
  • Definition 3
  • Proposition 2
  • Proposition 3
  • proof
  • ...and 1 more