Can LLM-Augmented autonomous agents cooperate?, An evaluation of their cooperative capabilities through Melting Pot

Manuel Mosquera, Juan Sebastian Pinzon, Manuel Rios, Yesid Fonseca, Luis Felipe Giraldo, Nicanor Quijano, Ruben Manrique

TL;DR

The paper investigates whether LLM-augmented autonomous agents (LAAs) can cooperate effectively using Melting Pot's Commons Harvest, introducing an abstraction layer and a reusable architecture with memory and cognitive modules. It adapts Melting Pot to natural-language observations for LLMs and evaluates cooperative capabilities with several personality and scenario manipulations, comparing to GPT-4/3.5 and bots. Findings show a tendency to cooperate but limited understanding of effective collaboration; coordination failures and resource depletion indicate the need for an enhanced architecture featuring understanding, communication, institutions, and reputation modules. The work contributes a practical framework, discusses limitations, and outlines a path toward more robust cooperative LAAs.

Abstract

As the field of AI continues to evolve, a significant dimension of this progression is the development of Large Language Models and their potential to enhance multi-agent artificial intelligence systems. This paper explores the cooperative capabilities of Large Language Model-augmented Autonomous Agents (LAAs) using the well-known Melting Pot environments along with reference models such as GPT-4 and GPT-3.5. Preliminary results suggest that while these agents demonstrate a propensity for cooperation, they still struggle with effective collaboration in given environments, emphasizing the need for more robust architectures. The study's contributions include an abstraction layer to adapt Melting Pot game scenarios for LLMs, the implementation of a reusable architecture for LLM-mediated agent development (which includes short- and long-term memories and different cognitive modules), and the evaluation of cooperation capabilities using a set of metrics tied to Melting Pot's "Commons Harvest" game. The paper closes by discussing the limitations of the current architectural framework and the potential of a new set of modules that fosters better cooperation among LAAs.

Paper Structure

This paper contains 35 sections, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Screen capture of a running simulation of the Commons Harvest scenario. Bots can be identified by their black arms and legs.
  • Figure 2: The flow diagram for an action taken by an LLM agent.
  • Figure 3: The per capita average reward of the agents by scenario. Ten simulations were performed per scenario to assess how the agents' assigned personalities could affect population welfare. The scenario with no particular personality assigned exhibited the best per capita reward, followed by scenarios where agents were instructed to be selfish, and lastly, the worst performance was observed in scenarios where agents were instructed to be cooperative.
  • Figure 4: The number of times the agents decided to attack and the number of times the attacks were effective, i.e., the number of times the attack hit the other agent, thus removing that agent from the game for the next five moves. The scenarios All selfish and Without personality registered a higher number of attacks, while the scenarios All coop. and All coop. with def. showed the fewest attacks.
  • Figure 5: Ratio of the number of times the agent closed the distance towards the last apple of a tree to the number of times the last apple of a tree was the nearest to the agent. The results show no important differences among the first set of scenarios.
  • ...and 6 more figures
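
The quantities described in the captions of Figures 3 and 5 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the `Step` record, its field names, and both function names are hypothetical, and the sketch assumes that an approach is counted only on steps where the last apple of a tree was the apple nearest to the agent, and that the per capita reward is averaged first over agents and then over simulations.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One agent decision step (hypothetical log record, not the paper's format)."""
    last_apple_was_nearest: bool  # nearest apple was the last one left on its tree
    moved_closer: bool            # agent reduced its distance to that apple


def last_apple_approach_ratio(steps):
    """Figure 5 style indicator: approaches to a last apple / times it was nearest.

    Returns None when the last apple was never the nearest one (ratio undefined).
    """
    opportunities = [s for s in steps if s.last_apple_was_nearest]
    if not opportunities:
        return None
    approaches = sum(1 for s in opportunities if s.moved_closer)
    return approaches / len(opportunities)


def per_capita_reward(simulations):
    """Figure 3 style metric: mean reward per agent, averaged over simulations.

    `simulations` is a list of simulations; each one is a list holding the
    total reward collected by each agent in that simulation.
    """
    if not simulations or any(not sim for sim in simulations):
        raise ValueError("need at least one simulation with at least one agent")
    per_simulation = [sum(sim) / len(sim) for sim in simulations]
    return sum(per_simulation) / len(per_simulation)


if __name__ == "__main__":
    steps = [Step(True, True), Step(True, False), Step(False, True), Step(True, True)]
    print(last_apple_approach_ratio(steps))
    print(per_capita_reward([[10, 20], [30, 40]]))
```

Under this reading, a lower approach ratio would suggest that agents hold back from a tree's last apple more often, while the per capita reward summarizes population welfare for a scenario.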