LLM-based multi-agent poetry generation in non-cooperative environments

Ran Zhang; Steffen Eger

TL;DR

This paper argues for a paradigm shift in creative tasks such as automatic poetry generation to include social learning processes (via LLM-based agent modeling) similar to human interaction.

Abstract

Despite substantial progress of large language models (LLMs) in automatic poetry generation, the generated poetry lacks diversity, and the training process differs greatly from human learning. Under the rationale that the learning process of poetry generation systems should be more human-like and their output more diverse and novel, we introduce a framework based on social learning in which we emphasize non-cooperative interactions alongside cooperative interactions to encourage diversity. Our experiments are the first attempt at LLM-based multi-agent systems in non-cooperative environments for poetry generation, employing both TRAINING-BASED agents (GPT-2) and PROMPTING-BASED agents (GPT-3 and GPT-4). Our evaluation based on 96k generated poems shows that our framework benefits the poetry generation process for TRAINING-BASED agents, resulting in a 3.0-3.7 percentage point (pp) increase in diversity and a 5.6-11.3 pp increase in novelty according to distinct and novel n-grams. The generated poetry from TRAINING-BASED agents also exhibits group divergence in terms of lexicons, styles and semantics. PROMPTING-BASED agents in our framework also benefit from non-cooperative environments, and a more diverse ensemble of models with non-homogeneous agents has the potential to further enhance diversity, with an increase of 7.0-17.5 pp according to our experiments. However, PROMPTING-BASED agents show a decrease in lexical diversity over time and do not exhibit the group-based divergence intended in the social network. Our paper argues for a paradigm shift in creative tasks such as automatic poetry generation to include social learning processes (via LLM-based agent modeling) similar to human interaction.
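The diversity and novelty figures above are stated in terms of distinct and novel n-grams. The following is a minimal sketch of how such scores could be computed, following the definitions given in the caption of Figure 2. It is not the authors' implementation: whitespace tokenization, the function names, and the choice to count every occurrence of a novel n-gram (rather than each novel n-gram once) are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(poems: Sequence[str], n: int) -> float:
    """Percentage of distinct n-grams among all n-grams in the generated poems."""
    # Assumption: whitespace tokenization; n-grams do not cross poem boundaries.
    grams = [g for poem in poems for g in ngrams(poem.split(), n)]
    if not grams:
        return 0.0
    return 100.0 * len(set(grams)) / len(grams)


def novelty_n(poems: Sequence[str], training_texts: Sequence[str], n: int) -> float:
    """Novel n-gram occurrences, scaled by the total number of generated tokens."""
    seen = {g for text in training_texts for g in ngrams(text.split(), n)}
    total_tokens = sum(len(poem.split()) for poem in poems)
    if total_tokens == 0:
        return 0.0
    # Assumption: every occurrence of an n-gram unseen in training counts.
    novel = sum(1 for poem in poems for g in ngrams(poem.split(), n) if g not in seen)
    return 100.0 * novel / total_tokens
```

Under these assumptions, a gain of a few percentage points in distinct-1 means that a larger share of the generated tokens are word forms not repeated elsewhere in the generated set.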

Paper Structure (27 sections, 1 equation, 6 figures, 13 tables, 1 algorithm)

Introduction
Related work
Social learning framework for poetry generation
The social network
The learning process
The learning strategy
Training-based agents
Prompting-based agents
Experiments
Agent initialization
Experimental setup
Evaluation
Experiment results
Automatic Evaluation: the generation dynamics of agents
Increasing diversity and novelty according to distinct and novel n-grams for training-based agents
...and 12 more sections

Figures (6)

Figure 1: Illustration of the predefined social network ($M=4$) and a high-level description of the learning process for training-based agents (GPT-2) and prompting-based agents (GPT-3.5 and GPT-4). The green and red lines in the social network indicate cooperative (+) and non-cooperative (-) interactions between agents, respectively.
Figure 2: Dynamics of agent diversity and novelty over varying training parameters. Diversity is measured by the percentage of distinct uni-grams (distinct-1) and bi-grams (distinct-2) in the generated poems. Novelty is measured by the number of uni-grams (novelty-1) and bi-grams (novelty-2) in the generated poems that are novel relative to the training data, scaled by the total number of generated tokens. (a) The effect of the scaling parameter $\alpha$ in Equation (1). (b) The effect of the number of interactive agents $\#\mathcal{A}$ during the decoding stage. (c) The effect of finetuning strategies: $\mathcal{L}_{\text{CE}}$ and $\mathcal{L}_{\text{CL}}$ indicate cross-entropy loss and contrastive loss. Prefix refers to conditioned finetuning. The horizontal red dashed line indicates the state of the initial agents at iteration 0.
Figure 3: Dynamics of diversity for prompting-based agents over varying prompting strategies based on GPT-3.5 and GPT-4.
Figure 4: Divergence of training-based agents measured by the mean of pairwise semantic similarity over varying training parameters. (a) The effect of the scaling parameter $\alpha$ in Equation (1). (b) The effect of the number of interactive agents $\#\mathcal{A}$ during the decoding stage. (c) The effect of finetuning strategies: $\mathcal{L}_{\text{CE}}$ and $\mathcal{L}_{\text{CL}}$ indicate cross-entropy loss and contrastive loss. Prefix refers to conditioned finetuning. The solid and dashed lines represent semantic similarity measured for 'in-group' and 'out-group' affiliations, respectively.
Figure 5: Group divergence of prompting-based agents measured by the mean of pairwise semantic similarity over varying prompting strategies and base models. The solid and dashed lines represent semantic similarity measured for 'in-group' and 'out-group' affiliations, respectively.
...and 1 more figure
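Figures 4 and 5 report group divergence as the mean of pairwise semantic similarity, split into 'in-group' and 'out-group' pairs. The sketch below shows one way such a split could be computed. It assumes that each agent is already represented by a single embedding vector (for example, an average over embeddings of its poems) and that cosine similarity is the similarity measure; both are illustrative assumptions, as are the function names, and none of this is taken from the paper.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_similarity(
    embeddings: Dict[str, List[float]],  # hypothetical: one vector per agent
    group_of: Dict[str, str],            # hypothetical: agent name -> group label
) -> Tuple[float, float]:
    """Return (mean in-group similarity, mean out-group similarity)."""
    in_group: List[float] = []
    out_group: List[float] = []
    for a, b in combinations(sorted(embeddings), 2):
        sim = cosine(embeddings[a], embeddings[b])
        # A pair is in-group when both agents carry the same group label.
        (in_group if group_of[a] == group_of[b] else out_group).append(sim)

    def mean(values: List[float]) -> float:
        return sum(values) / len(values) if values else float("nan")

    return mean(in_group), mean(out_group)
```

Under this reading, group divergence shows up as in-group similarity staying above out-group similarity across iterations.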
