Towards Collaborative Intelligence: Propagating Intentions and Reasoning for Multi-Agent Coordination with Large Language Models

Xihe Qiu, Haoyu Wang, Xiaoyu Tan, Chao Qu, Yujie Xiong, Yuan Cheng, Yinghui Xu, Wei Chu, Yuan Qi

TL;DR

The paper tackles coordinating multiple LLM-based agents in cooperative MARL by introducing ReMALIS, which propagates private intentions and enables bidirectional feedback to align planning, grounding, and execution. ReMALIS consists of planning, grounding, and cooperative execution modules connected through intention propagation channels, allowing agents to maintain and share intentions to coordinate sub-tasks. A joint training objective balances grounding uncertainty, miscoordination penalties, and team rewards, with recursive reasoning enhancing contextual understanding for complex tasks. Empirical results on traffic flow prediction and web-activity tasks show ReMALIS outperforming strong baselines, including larger single-agent models, with a parameter-efficient 7B model achieving competitive performance through cooperative, intention-guided coordination.

Abstract

Effective collaboration in multi-agent systems requires communicating goals and intentions between agents. Current agent frameworks often suffer from dependencies on single-agent execution and lack robust inter-module communication, frequently leading to suboptimal multi-agent reinforcement learning (MARL) policies and inadequate task coordination. To address these challenges, we present a framework for training large language models (LLMs) as collaborative agents to enable coordinated behaviors in cooperative MARL. Each agent maintains a private intention consisting of its current goal and associated sub-tasks. Agents broadcast their intentions periodically, allowing other agents to infer coordination tasks. A propagation network transforms broadcast intentions into teammate-specific communication messages, sharing relevant goals with designated teammates. The architecture of our framework is structured into planning, grounding, and execution modules. During execution, multiple agents interact in a downstream environment and communicate intentions to enable coordinated behaviors. The grounding module dynamically adapts comprehension strategies based on emerging coordination patterns, while feedback from execution agents influences the planning module, enabling the dynamic re-planning of sub-tasks. Results in collaborative environment simulations demonstrate that intention propagation reduces miscoordination errors by aligning sub-task dependencies between agents. Agents learn when to communicate intentions and which teammates require task details, resulting in emergent coordinated behaviors. This demonstrates the efficacy of intention sharing for cooperative multi-agent RL based on LLMs.
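
The intention-sharing loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names `Intention`, `Agent`, `propagate`, and `broadcast_round` are hypothetical, and the learned propagation network is replaced here by a simple hand-written rule (a teammate receives a message only when it shares a sub-task with the sender).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Intention:
    """A private intention: the current goal plus its associated sub-tasks."""
    goal: str
    sub_tasks: List[str] = field(default_factory=list)


@dataclass
class Agent:
    """Hypothetical agent holding one intention and an inbox of messages."""
    name: str
    intention: Intention
    inbox: List[str] = field(default_factory=list)


def propagate(sender: Agent, team: List[Agent]) -> Dict[str, str]:
    """Turn one broadcast intention into teammate-specific messages.

    Assumption: a rule-based stand-in for the learned propagation network.
    Only teammates that share a sub-task with the sender get a message.
    """
    messages: Dict[str, str] = {}
    for mate in team:
        if mate.name == sender.name:
            continue  # an agent does not message itself
        shared = [t for t in sender.intention.sub_tasks
                  if t in mate.intention.sub_tasks]
        if shared:
            messages[mate.name] = (
                f"{sender.name} -> {sender.intention.goal}: {', '.join(shared)}"
            )
    return messages


def broadcast_round(team: List[Agent]) -> None:
    """Every agent broadcasts once; messages land in the recipients' inboxes."""
    by_name = {agent.name: agent for agent in team}
    for sender in team:
        for recipient, message in propagate(sender, team).items():
            by_name[recipient].inbox.append(message)


# Toy team with made-up goals and sub-tasks, for illustration only.
team = [
    Agent("a", Intention("clear junction", ["signal_north", "count_cars"])),
    Agent("b", Intention("forecast flow", ["count_cars"])),
    Agent("c", Intention("log events", ["write_log"])),
]
broadcast_round(team)
```

In this toy run, agents `a` and `b` exchange messages about the shared `count_cars` sub-task, while `c` receives nothing because none of its sub-tasks overlap. In the framework itself, which teammates to inform and when to communicate are learned rather than hard-coded.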

Paper Structure

This paper contains 29 sections, 12 equations, 7 figures, 6 tables, and 4 algorithms.

Figures (7)

  • Figure 1: This framework introduces a multi-agent learning strategy designed to enhance the capabilities of LLMs through cooperative coordination. It enables agents to collaborate and share intentions for effective coordination, and utilizes recursive reasoning to model and adapt to each other's strategies.
  • Figure 2: Overview of the proposed ReMALIS: This framework comprises a planning module, grounding module, cooperative execution module, and intention coordination channels.
  • Figure 3: Comparative performance evaluation across varying task difficulty levels for the web activities dataset, which indicates the accuracy scores achieved by ReMALIS and several state-of-the-art baselines.
  • Figure 4: Overview of the proposed ReMALIS Planning Module for predicting sub-goals based on current goals, intentions, grounded embeddings, and agent feedback.
  • Figure 5: Framework of the proposed ReMALIS Grounding Module that contextualizes symbol embeddings using the current state, intentions, and feedback signals.
  • ...and 2 more figures