Can LLM-Augmented autonomous agents cooperate?, An evaluation of their cooperative capabilities through Melting Pot
Manuel Mosquera, Juan Sebastian Pinzon, Manuel Rios, Yesid Fonseca, Luis Felipe Giraldo, Nicanor Quijano, Ruben Manrique
TL;DR
The paper investigates whether LLM-augmented autonomous agents (LAAs) can cooperate effectively in Melting Pot's Commons Harvest, introducing an abstraction layer and a reusable architecture with memory and cognitive modules. It adapts Melting Pot to natural-language observations for LLMs and evaluates cooperative capabilities under several personality and scenario manipulations, with GPT-4, GPT-3.5, and bots as points of comparison. Findings show a tendency to cooperate but a limited understanding of effective collaboration; coordination failures and resource depletion indicate the need for an enhanced architecture featuring understanding, communication, institutions, and reputation modules. The work contributes a practical framework, discusses limitations, and outlines a path toward more robust cooperative LAAs.
Abstract
As the field of AI continues to evolve, a significant dimension of this progression is the development of Large Language Models and their potential to enhance multi-agent artificial intelligence systems. This paper explores the cooperative capabilities of Large Language Model-augmented Autonomous Agents (LAAs) using the well-known Melting Pot environments along with reference models such as GPT-4 and GPT-3.5. Preliminary results suggest that while these agents demonstrate a propensity for cooperation, they still struggle with effective collaboration in given environments, emphasizing the need for more robust architectures. The study's contributions include an abstraction layer to adapt Melting Pot game scenarios for LLMs; the implementation of a reusable architecture for LLM-mediated agent development, which includes short- and long-term memories and different cognitive modules; and the evaluation of cooperation capabilities using a set of metrics tied to Melting Pot's "Commons Harvest" game. The paper closes by discussing the limitations of the current architectural framework and the potential of a new set of modules that fosters better cooperation among LAAs.
