Differentiable Normative Guidance for Nash Bargaining Solution Recovery

Moirangthem Tiken Singh, Surajit Borkotokey, Rajnish Kumar

Abstract

Autonomous artificial intelligence agents in negotiation systems must generate equitable utility allocations that satisfy individual rationality (IR), which ensures each agent receives at least its outside option, and the Nash Bargaining Solution (NBS), which maximizes joint surplus. Existing generative models often learn suboptimal human behaviors, producing solutions far from Pareto efficiency, while classical methods require full knowledge of the Pareto frontier, which is unavailable in real datasets. We propose a guided graph diffusion framework that generates individually rational utility vectors while approximating the NBS without frontier knowledge at inference time. Negotiations are modeled as directed graphs, with graph attention capturing asymmetric agent attributes, and a conditional diffusion model maps these graphs to utility vectors. A differentiable composite guidance loss, applied in the final reverse diffusion steps, penalizes IR violations and Nash product gaps. We prove that, under sufficient penalty weighting, solutions enter the IR region in finite time. Across datasets, the method achieves 100% IR compliance. Nash efficiency reaches 99.45% on synthetic data (within 0.55 percentage points of an oracle), 54.24% on CaSiNo, and 88.67% on Deal or No Deal, improving by 20-60 percentage points over unconstrained generative baselines.
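As a concrete sketch of the composite guidance objective described in the abstract, the fragment below combines a hinge penalty on IR violations with a log-surplus (Nash product) term and applies one gradient correction step. The weights `beta` and `lam`, the finite-difference gradient, and the step size are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def guidance_loss(u, d, beta=50.0, lam=1.0):
    """Composite guidance loss: IR hinge penalty plus Nash-product gap.

    u : candidate utility vector (one entry per agent)
    d : disagreement (outside-option) vector
    beta, lam : penalty weights (illustrative values)
    """
    ir_violation = np.maximum(d - u, 0.0)       # hinge on IR: u_i >= d_i
    ir_term = beta * np.sum(ir_violation ** 2)
    surplus = np.maximum(u - d, 1e-8)           # clamp to keep log finite
    nash_term = -lam * np.sum(np.log(surplus))  # maximize the Nash product
    return ir_term + nash_term

def guidance_grad(u, d, beta=50.0, lam=1.0, eps=1e-5):
    """Central finite-difference gradient of the guidance loss."""
    g = np.zeros_like(u)
    for i in range(len(u)):
        up, dn = u.copy(), u.copy()
        up[i] += eps
        dn[i] -= eps
        g[i] = (guidance_loss(up, d, beta, lam)
                - guidance_loss(dn, d, beta, lam)) / (2 * eps)
    return g

# One guided correction step on a sample that violates IR for agent 0
u = np.array([0.10, 0.60])   # candidate allocation
d = np.array([0.25, 0.20])   # disagreement points
u_next = u - 0.01 * guidance_grad(u, d)
```

The IR penalty dominates for the violating coordinate, pulling it toward its disagreement point, while the log term nudges feasible coordinates toward larger surplus.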

Paper Structure

This paper contains 21 sections, 2 theorems, 16 equations, 6 figures, 7 tables, and 1 algorithm.

Key Result

Lemma 3.1

For any initial sample $\mathbf{u}_T \notin \mathcal{F}$, if the IR penalty weight satisfies $\beta > M + \sup_{\mathbf{u} \notin \mathcal{F},\, t} \|\mathbf{f}(\mathbf{u}, t)\|$, the guided trajectory enters $\mathcal{F}$ in finite time.
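The mechanism behind Lemma 3.1 can be illustrated with a toy one-dimensional dynamic (a sketch, not the paper's proof): a fixed adversarial drift `f` stands in for the bounded reverse-diffusion field, and a constant-magnitude push of weight `beta` plays the role of the IR penalty subgradient. Whenever `beta` exceeds the drift bound, each guided step makes strictly positive progress toward $\mathcal{F}$, so entry occurs after finitely many steps.

```python
import numpy as np

def steps_to_enter_ir(u0, d, f, beta, step=0.01, cap=10_000):
    """Iterate the guided update until the trajectory enters u >= d.

    f    : fixed adversarial drift standing in for the bounded model field
    beta : penalty weight scaling a unit push toward the IR region
    Returns (number of steps taken, final state).
    """
    u, steps = np.asarray(u0, dtype=float), 0
    while np.any(u < d) and steps < cap:
        push = beta * (u < d).astype(float)   # penalty direction, active outside F
        u = u + step * (f + push)
        steps += 1
    return steps, u

# beta = 2.0 exceeds the drift bound ||f|| = 0.9, mirroring the lemma's condition
steps, u = steps_to_enter_ir(u0=[-0.5], d=np.array([0.0]),
                             f=np.array([-0.9]), beta=2.0)
```

With these numbers each step gains 0.01 * (2.0 - 0.9) = 0.011 toward the feasible set, so the trajectory starting at -0.5 enters $u \geq 0$ after 46 steps, even against the worst-case drift.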

Figures (6)

  • Figure 1: System architecture for equitable utility generation. Strategic Graph Encoding: Agent feature vectors $\mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^k$, encoding negotiation attributes (disagreement points, resource constraints, preference weights), are processed by a Graph Attention Network v2 (GATv2) with $K$ attention heads over an $n$-node directed graph, producing a shared context embedding $h \in \mathbb{R}^{d_h}$. Conditional Multilayer Perceptron (MLP) Diffusion: At each reverse diffusion step, $h$ is concatenated with the noisy state $\mathbf{u}_t \in \mathbb{R}^n$ and sinusoidal time embedding $t_{\text{emb}} \in \mathbb{R}^{d_t}$ ($\omega_j = 10000^{-2j/D}$), forming a $d_z$-dimensional input ($d_z = n + d_t + d_h$) to the MLP denoiser $s_\theta$, which predicts noise $\hat{\boldsymbol{\varepsilon}}$. Normative Guidance: In the final $t_{\text{start}}$ fraction of denoising steps, the gradient $\nabla_{\hat{\mathbf{u}}_0} \mathcal{L}_{\text{guide}}$ steers $\hat{\mathbf{u}}_0$ toward Individual Rationality and NBS using the guidance loss.
  • Figure 2: Hyperparameter sensitivity profile for the Synthetic NTU dataset. Left: Spider chart ranking parameters by sensitivity score (range/mean averaged over four metrics), identifying $t_{\text{start}}$ and $\lambda$ as the dominant factors (${\approx}0.80$ and ${\approx}0.70$ respectively), while $\alpha$ and $\gamma$ are effectively saturated (${\leq}0.20$). Right: Spearman $\rho$ matrix quantifying directionality. $\lambda$ exhibits perfect monotone correlation with Nash Product ($\rho=+1.00$) and Nash Efficiency ($\rho=+1.00$), while $t_{\text{start}}$ is strongly negatively correlated ($\rho=-0.88$), confirming that earlier, stronger guidance is the primary driver of Nash-optimal allocation. The $\beta$ row exhibits a counterintuitive negative correlation ($\rho \approx -0.34$ with IR), attributable to gradient overshooting at excessive penalty magnitudes. Symbol--code name correspondence follows Table \ref{tab:notation}.
  • Figure 3: Joint 2-D grid search over $\lambda$ (lambda_guide, y-axis) and $t_{\text{start}}$ (guide_start_frac, x-axis) across four evaluation metrics (Synthetic NTU dataset). The dashed blue box marks the base configuration ($\lambda=0.03$, $t_{\text{start}}=0.30$); the solid box marks the composite-optimal configuration. Large $\lambda$ combined with early activation maximizes Nash Product and IR Compliance but increases Frontier Distance; small $\lambda$ with late activation preserves feasibility at the cost of Nash alignment, yielding efficiencies as low as $2.5\%$. No single cell dominates all four metrics simultaneously, confirming that a composite objective is required.
  • Figure 4: Visual evaluation of generated utility allocations across the three negotiation domains under optimized guidance configurations (Table \ref{tab:optimal_params}). Panel (a): Spatial distribution of generated utilities in $[0,1]^2$; the guided framework (green) actively repels from the disagreement points $\mathbf{d}$ (black crosses) while concentrating mass near the Pareto frontier arc. Panel (b): Nash product density; rightward shift indicates improved joint surplus and fairness. Panels (c) and (d): IR Compliance and Nash Efficiency distributions quantifying the reduction in axiomatic violations and recovery of operational optimality relative to the unguided baseline.
  • Figure 5: Aggregate trajectory statistics (mean $\pm 1$ std, $n=30$ test cases) for guided (green) and unguided (orange) DDIM chains across the three negotiation domains. The green shaded band indicates the active guidance window ($t/T < t_{\text{start}}$). Top (Synthetic NTU): IR Compliance (right panel) shows the unguided model suffering from late-stage cumulative drift, while the guided chain maintains strict $1.000$ compliance throughout the window. Middle (Deal or No Deal): Nash Product (left panel) diverges from the unguided baseline upon entering the guidance window, arresting joint surplus decay. Bottom (CaSiNo): Frontier Distance (center panel) shows synchronized convergence to zero for both modes, confirming that the MLP denoiser independently places samples near the Pareto arc; guidance provides directional (Nash-optimal) correction, not general feasibility correction.
  • ...and 1 more figure
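Figure 1 specifies the sinusoidal time embedding frequencies as $\omega_j = 10000^{-2j/D}$. The sketch below pairs sines and cosines in the standard transformer convention; that layout, and the dimension $d_t = 64$, are assumptions, since the caption does not fix them.

```python
import numpy as np

def time_embedding(t, dim):
    """Sinusoidal embedding of a diffusion timestep t into R^dim.

    Frequencies follow omega_j = 10000**(-2j/D) as in Figure 1; the
    sin/cos pairing is the standard transformer convention (assumed).
    """
    half = dim // 2
    freqs = 10000.0 ** (-2.0 * np.arange(half) / dim)  # omega_j for j = 0..D/2-1
    angles = t * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

emb = time_embedding(t=25, dim=64)  # d_t = 64 chosen for illustration
```

The resulting vector is what Figure 1 denotes $t_{\text{emb}} \in \mathbb{R}^{d_t}$, concatenated with the noisy state and graph context before the MLP denoiser.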

Theorems & Definitions (4)

  • Lemma 3.1: Finite-Time Convergence to IR
  • Proof of Lemma 3.1
  • Theorem 3.2: Asymptotic Convergence to the NBS
  • Proof of Theorem 3.2