AsymPuzl: An Asymmetric Puzzle for multi-agent cooperation

Xavier Cadet, Edward Koh, Peter Chin

TL;DR

AsymPuzl introduces a minimal two-agent puzzle environment to study communication under information asymmetry in multi-turn LLM interactions. The approach provides a controllable testbed with tunable puzzle size and feedback to isolate information sharing dynamics. Key findings show strong models can converge with complete information sharing, while weaker models struggle with miscommunication; feedback granularity significantly influences performance, with self-feedback generally helping and detailed joint feedback sometimes hindering. The work demonstrates AsymPuzl's value for probing coordination strategies and guides future research on noisy views, bandwidth constraints, and scaling to more agents.

Abstract

Large Language Model (LLM) agents are increasingly studied in multi-turn, multi-agent scenarios, yet most existing setups emphasize open-ended role-play rather than controlled evaluation. We introduce AsymPuzl, a minimal but expressive two-agent puzzle environment designed to isolate communication under information asymmetry. Each agent observes a complementary but incomplete view of a symbolic puzzle, and the two must exchange messages to solve it cooperatively. Using a diverse set of current-generation and open-source LLMs, we show that (i) strong models such as GPT-5 and Claude-4.0 reliably converge on the solution across puzzle sizes by sharing complete information in two turns, (ii) weaker models often ignore partner messages or over-correct their hypotheses, and (iii) feedback design is non-trivial: simple self-feedback improves success rates, while detailed joint feedback can hurt performance. These findings show that even in simple cooperative tasks, LLM communication strategies diverge and depend on the granularity of feedback signals. AsymPuzl thus provides a testbed for probing the limits of multi-turn cooperation and opens avenues for studying coordination mechanisms.

Paper Structure

This paper contains 37 sections, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Overview of the puzzle: the ground truth is created first, then each agent's partial view is generated and shared with that agent as clues. Each working hypothesis starts as a copy of the clues. Then, in a turn-based interaction, the agents, Alice and Bob, can send each other messages and update their working hypotheses until both hypotheses match the ground truth.
  • Figure 2: (Lower is better) Average number of actions per position. GPT-5 and Claude-4.0 are close to optimal on average with few positional modifications. Meanwhile, GPT-3.5-turbo tends not to modify positions despite the puzzle being unsolved, and Llama 3.2-11B tends to modify positions more than 4 times on average.
  • Figure 3: Comparison of the success rate of each LLM over time (turns) for 5-piece puzzles with No feedback or Both feedback. Providing feedback about both sides of the puzzle increases success rate.
  • Figure 4: Example of successful completion using Claude 4.0. By the end of the second turn, the puzzle is solved, as both agents shared all of their information and cooperated.
  • Figure 5: Example of lack of cooperation using Llama 3.2-11B. Both agents ignore one another's messages.
  • ...and 1 more figures
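
The setup described in the Figure 1 caption can be illustrated with a small sketch. This is not the authors' implementation: the attribute names (shapes and colors), the way the ground truth is split between Alice and Bob, the rule that one agent speaks per turn, and the helper names `make_puzzle`, `merge`, and `play` are all assumptions made for illustration. The scripted agents simply send their whole working hypothesis, which mimics the full information sharing reported for the strongest models.

```python
import random

# Hypothetical attribute values; the source does not specify the puzzle symbols.
SHAPES = ["circle", "square", "triangle"]
COLORS = ["red", "green", "blue"]


def make_puzzle(n_pieces, seed=0):
    """Build a ground truth and two complementary partial views (assumed split)."""
    rng = random.Random(seed)
    truth = [(rng.choice(SHAPES), rng.choice(COLORS)) for _ in range(n_pieces)]
    alice_clues = [(shape, None) for shape, _ in truth]  # Alice sees shapes only
    bob_clues = [(None, color) for _, color in truth]    # Bob sees colors only
    return truth, alice_clues, bob_clues


def merge(hypothesis, message):
    """Fill the unknown attributes of a hypothesis from a partner's message."""
    return [
        (s if s is not None else ms, c if c is not None else mc)
        for (s, c), (ms, mc) in zip(hypothesis, message)
    ]


def play(n_pieces, max_turns=6, seed=0):
    """Return the turn at which both hypotheses match the truth, else None."""
    truth, alice_clues, bob_clues = make_puzzle(n_pieces, seed)
    # Working hypotheses start as copies of the clues.
    hyp = {"Alice": list(alice_clues), "Bob": list(bob_clues)}
    for turn in range(1, max_turns + 1):
        # Assumption: agents alternate, with Alice speaking on odd turns.
        speaker, listener = ("Alice", "Bob") if turn % 2 == 1 else ("Bob", "Alice")
        message = hyp[speaker]  # scripted agent: share everything it knows
        hyp[listener] = merge(hyp[listener], message)
        if hyp["Alice"] == truth and hyp["Bob"] == truth:
            return turn
    return None
```

Under these assumptions, calling `play(5)` models a 5-piece puzzle. Replacing the scripted message with an LLM call, or dropping part of the message, is where miscommunication of the kind attributed to weaker models would enter.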