A Fairness-Driven Method for Learning Human-Compatible Negotiation Strategies

Ryan Shea; Zhou Yu

TL;DR

FDHC is a fairness-driven framework for learning human-compatible negotiation strategies that targets the Egalitarian Bargaining Solution (EBS) within an extensive-form Nash bargaining game. It introduces LGM-Zero, an RL+search method that combines a pre-trained language model with a value network to retrieve and evaluate human-like offers in large action spaces, guided by Monte Carlo tree search (MCTS). The theoretical analysis shows convergence to a Nash equilibrium under mild assumptions and convergence to the EBS under stronger conditions. The approach is evaluated in a single-issue car bargaining scenario, where automatic and human evaluations indicate that FDHC yields more egalitarian outcomes than strong LLM baselines, with competitive negotiation quality and human-likeness comparable to GPT-4. The work offers a practical, theory-grounded path to human-compatible negotiation agents and suggests future extensions to alternative bargaining solutions and broader negotiation domains.
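To make the egalitarian target concrete, here is a minimal sketch of picking the egalitarian price in a single-issue negotiation. It is not the paper's implementation. It assumes linear payoffs, a disagreement payoff of zero for both sides, and a discrete grid of candidate prices. The function name `egalitarian_price` and the valuations `seller_floor` and `buyer_ceiling` are hypothetical; the values 11,000 and 15,000 are chosen only so that the result matches the $13,000 deal price quoted in the figure captions.

```python
from typing import Iterable, Optional


def egalitarian_price(
    prices: Iterable[int],
    seller_floor: int,    # hypothetical: lowest price the seller accepts
    buyer_ceiling: int,   # hypothetical: highest price the buyer accepts
) -> Optional[int]:
    """Return the candidate price that maximizes the smaller of the two gains.

    Assumes linear payoffs and a disagreement payoff of zero for both
    parties, so each side's gain is measured from its own walk-away price.
    """

    def worst_gain(price: int) -> int:
        seller_gain = price - seller_floor
        buyer_gain = buyer_ceiling - price
        return min(seller_gain, buyer_gain)

    # Keep only prices that leave neither side worse off than no deal.
    feasible = [p for p in prices if worst_gain(p) >= 0]
    if not feasible:
        return None
    # Maximizing the minimum gain equalizes the two gains when possible.
    return max(feasible, key=worst_gain)


if __name__ == "__main__":
    candidates = range(11_000, 15_001, 500)
    print(egalitarian_price(candidates, seller_floor=11_000, buyer_ceiling=15_000))
```

With linear payoffs the max-min price is also the price at which the difference between the two payoffs is smallest, which is the criterion the figure captions describe.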

Abstract

Despite recent advancements in AI and NLP, negotiation remains a difficult domain for AI agents. Traditional game-theoretic approaches that have worked well for two-player zero-sum games struggle in negotiation because they cannot learn human-compatible strategies. On the other hand, approaches that use only human data tend to be domain-specific and lack the theoretical guarantees provided by strategies grounded in game theory. Motivated by the notion of fairness as a criterion for optimality in general-sum games, we propose a negotiation framework called FDHC, which incorporates fairness into both the reward design and the search to learn human-compatible negotiation strategies. Our method includes a novel RL+search technique called LGM-Zero, which leverages a pre-trained language model to retrieve human-compatible offers from large action spaces. Our results show that our method achieves more egalitarian negotiation outcomes and improves negotiation quality.

Paper Structure (36 sections, 6 theorems, 4 equations, 6 figures, 13 tables)

Introduction
Background
Related Work
Method
FDHC Negotiation Framework
LGM-Zero
Inference
Training
Implementation
Theoretical Analysis
Experiments
Baselines
Automatic Evaluation
Human Evaluation
Conclusion and Future Work
...and 21 more sections

Key Result

Theorem 1

Let $t_n$ denote FDHC's final turn in the negotiation, let $\alpha$ denote the outcome proposed at $t_{n-1}$, and let $\text{EBS}(x)$ denote the EBS value for some outcome $x$. Setting FDHC's estimate of $S = \operatorname{arg\,max}(\text{EBS}(\alpha), \text{EBS}(d))$ at $t_n$ will result in

Figures (6)

Figure 1: Outline of our FDHC negotiation framework. Our method consists of decomposing the extensive form Nash bargaining game into a series of depth-limited subgames. At each subgame we calculate the EBS and apply a human-like strategy which targets this outcome using a MCTS guided by a LLM and value network.
Figure 2: Binned deal price frequencies of 100 negotiations between our baselines and a GPT-4 buyer. Our goal is to achieve deal prices that minimize the difference in payoff between the buyer and seller. In our scenario this amount is minimized at a deal price of $13,000.
Figure 3: Binned deal price frequencies of 30 negotiations between our baselines and a human buyer. Our goal is to achieve deal prices that minimize the difference in payoff between the buyer and seller. In our scenario this amount is minimized at a deal price of $13,000.
Figure 4: Binned deal price frequencies of 50 negotiations between FDHC and a series of non-egalitarian buyers. Our goal is to achieve deal prices that minimize the difference in payoff between the buyer and seller. In our scenario this amount is minimized at a deal price of $13,000.
Figure 5: Negotiation scenario for the Buyer
...and 1 more figure

Theorems & Definitions (17)

Theorem 1
Theorem 2
Definition 1
Definition 2
Definition 3
Definition 4
Definition 5
Definition 6
Definition 7
Lemma 1
...and 7 more
