CoMet: Metaphor-Driven Covert Communication for Multi-Agent Language Games

Shuhang Xu, Fangwei Zhong

TL;DR

LLM agents struggle to understand and generate metaphors in multi-agent language games, which limits their capacity for covert communication. CoMet addresses this with a hypothesis-based metaphor reasoning module and a self-improving metaphor generator, integrated with feature extraction, belief modeling, strategy planning, and a metaphor-enabled actor, so that agents can conceal, deceive, and misdirect. The framework shows substantial performance gains on Undercover and Adversarial Taboo across multiple LLMs, and ablation studies confirm the contribution of each component. The work broadens the capabilities of AI agents in adversarial and cooperative communication, with implications for secure negotiation and human-AI collaboration, and it acknowledges limitations and ethical considerations.

Abstract

Metaphors are a crucial way for humans to express complex or subtle ideas by comparing one concept to another, often from a different domain. However, many large language models (LLMs) struggle to interpret and apply metaphors in multi-agent language games, which hinders their ability to engage in covert communication and semantic evasion, both of which are essential for strategic communication. To address this challenge, we introduce CoMet, a framework that enables LLM-based agents to engage in metaphor processing. CoMet combines a hypothesis-based metaphor reasoner with a metaphor generator that improves through self-reflection and knowledge integration. This enhances the agents' ability to interpret and apply metaphors, improving the strategic and nuanced quality of their interactions. We evaluate CoMet on two multi-agent language games, Undercover and Adversarial Taboo, which emphasize covert communication and semantic evasion. Experimental results demonstrate that CoMet significantly enhances the agents' ability to communicate strategically using metaphors.

Paper Structure

This paper contains 40 sections, 7 equations, 21 figures, 4 tables.

Figures (21)

  • Figure 1: Comparison of three communication strategies (Straightforward Description, Concealment, and Metaphorical Description) in Undercover. In this example, a civilian describes a “butterfly”, and the reactions of the two players are shown. With the Straightforward method, the civilian successfully identifies their teammate, but the undercover agent guesses the word. With Concealment, the civilian’s vague clue causes confusion: the undercover agent fails to guess the word, but the civilian is also unable to identify their teammate. With the Metaphor method, the civilian describes the word subtly, so the civilian agent makes a correct identification while the undercover agent fails to guess the word.
  • Figure 2: Overview of the CoMet framework, illustrated within the “concept camouflage” task in Undercover. The agent starts by extracting features from the game state, including player behavior and available clues. The Metaphor Reasoner identifies and expands metaphors to aid in interpretation. As the game progresses, the agent uses the Belief Mapper to build beliefs about other players’ roles and tracks its own identity with the Self-Monitor. With this understanding, the Strategy Planner formulates a communication and action strategy. The agent then generates metaphorical speech through the Metaphor Generator to communicate covertly. Finally, it votes according to its assessment, while new dialogue and voting histories are recorded to inform future decisions.
  • Figure 3: The metaphor reasoning process based on hypothesis testing when players holding the word “kite” encounter the statement “homesick bird.” The process involves hypothesizing whether the metaphor refers to a kite (H0) or another object (H1), followed by analysis of features such as flight, lifelessness, and being tethered. Through metaphor expansion and hypothesis testing, the model determines that the metaphor best fits the description of a kite, supporting H0.
  • Figure 4: Performance comparison of different LLMs in Adversarial Taboo. (a) Game result statistics for Naive Agent, Agent with CoT, and Agent with CoMet. (b) Performance of LLMs with various methods when facing an attacker using CoT.
  • Figure 5: Evaluation of the comprehensive performance of CoT and CoMet agents in the Undercover game using balanced metrics.
  • ...and 16 more figures