A Fairness-Driven Method for Learning Human-Compatible Negotiation Strategies
Ryan Shea, Zhou Yu
TL;DR
FDHC is a fairness-driven framework for learning human-compatible negotiation strategies by targeting the Egalitarian Bargaining Solution (EBS) within an extensive-form Nash bargaining game. It introduces LGM-Zero, a method that combines reinforcement learning with search: a pre-trained language model and a value network retrieve and evaluate human-like offers in large action spaces, guided by Monte Carlo tree search. Theoretical guarantees show convergence to a Nash equilibrium under mild assumptions and potential convergence to the EBS under stronger conditions, while experiments demonstrate improved fairness and competitive negotiation quality relative to strong LLM baselines. The approach is demonstrated in a single-issue car bargaining scenario, where automatic and human evaluations indicate that FDHC yields more egalitarian outcomes than GPT-4 and comparable human-likeness. This work offers a practical, theory-grounded path to human-compatible negotiation agents and suggests future extensions to alternative bargaining solutions and broader negotiation domains.
Abstract
Despite recent advancements in AI and NLP, negotiation remains a difficult domain for AI agents. Traditional game-theoretic approaches that have worked well for two-player zero-sum games struggle in the context of negotiation due to their inability to learn human-compatible strategies. On the other hand, approaches that only use human data tend to be domain-specific and lack the theoretical guarantees provided by strategies grounded in game theory. Motivated by the notion of fairness as a criterion for optimality in general-sum games, we propose a negotiation framework called FDHC, which incorporates fairness into both the reward design and search to learn human-compatible negotiation strategies. Our method includes a novel RL+search technique called LGM-Zero, which leverages a pre-trained language model to retrieve human-compatible offers from large action spaces. Our results show that our method achieves more egalitarian negotiation outcomes and improves negotiation quality.
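
To make the fairness target named in the TL;DR concrete, the following is a minimal sketch of an egalitarian choice in a single-issue price negotiation. It is an illustration only, not the paper's reward or search procedure: the linear utilities, the zero disagreement payoff, the example dollar values, and the function name `egalitarian_offer` are all assumptions introduced here. The egalitarian solution is taken in its standard sense of maximizing the smaller of the two parties' gains over disagreement.

```python
from typing import Iterable


def egalitarian_offer(prices: Iterable[int], buyer_value: int, seller_cost: int) -> int:
    """Return the candidate price that maximizes the smaller party's gain.

    Assumptions (not from the paper): utilities are linear in price and
    both parties receive 0 if no deal is reached.
    """

    def min_gain(price: int) -> int:
        buyer_gain = buyer_value - price   # buyer's surplus at this price
        seller_gain = price - seller_cost  # seller's surplus at this price
        return min(buyer_gain, seller_gain)

    # Pick the offer whose worse-off party is as well off as possible.
    return max(prices, key=min_gain)


# Hypothetical car sale: buyer values the car at 15,000, seller's cost is 10,000.
candidate_prices = range(10_000, 15_001, 500)
print(egalitarian_offer(candidate_prices, buyer_value=15_000, seller_cost=10_000))
```

With these hypothetical numbers the selected price is 12,500, the point at which both parties gain the same amount.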
