Data-Driven Mechanism Design using Multi-Agent Revealed Preferences
Luke Snow, Vikram Krishnamurthy
TL;DR
This work introduces a data-driven mechanism-design framework for a sequence of independent one-shot games in which the designer observes only equilibrium actions and has no access to agent utilities. It casts mechanism design as a revealed-preference problem, deriving necessary and sufficient linear-inequality conditions (MM-GARP) that certify when observed mixed-strategy equilibria can be socially optimal, and defines a Pareto gap loss L(φ) that the SPSA-based Algorithm 1 minimizes to either achieve social optimality or certify that it is unattainable. The framework connects to robust revealed-preference metrics (CCEI, GARP_F) and extends to finite-sample settings through a distributionally robust RL formulation, which is recast as a semi-infinite program solvable via exchange methods. Two numerical experiments, on wireless spectrum sharing and river-pollution games, demonstrate rapid convergence of the Pareto gap and welfare gains using only equilibrium data. Overall, the paper provides a rigorous, data-driven pathway to design mechanisms that steer decentralized agents with unknown utilities toward socially desirable outcomes, with convergence guarantees and finite-sample robustness.
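To make the SPSA-based minimization of L(φ) concrete, the following is a minimal sketch of a generic simultaneous-perturbation stochastic approximation loop. It is not the paper's Algorithm 1: the function names, the gain-sequence constants, and the quadratic `toy_pareto_gap` are all assumptions chosen for illustration. In the actual framework, evaluating the loss would require observing equilibria under mechanism φ and solving the revealed-preference program.

```python
import random


def spsa_minimize(loss, phi0, iterations=2000, a=0.2, c=0.1, A=50.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Generic SPSA: estimates a gradient from two loss evaluations per step.

    All gain constants here are illustrative assumptions; alpha=0.602 and
    gamma=0.101 are commonly recommended decay exponents for SPSA.
    """
    rng = random.Random(seed)
    phi = list(phi0)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # step size, decays with k
        c_k = c / (k + 1) ** gamma       # perturbation size, decays with k
        # Rademacher perturbation: each coordinate is +1 or -1.
        delta = [rng.choice((-1.0, 1.0)) for _ in phi]
        plus = [p + c_k * d for p, d in zip(phi, delta)]
        minus = [p - c_k * d for p, d in zip(phi, delta)]
        diff = loss(plus) - loss(minus)
        # Simultaneous-perturbation gradient estimate.
        grad = [diff / (2.0 * c_k * d) for d in delta]
        phi = [p - a_k * g for p, g in zip(phi, grad)]
    return phi


def toy_pareto_gap(phi):
    """Hypothetical stand-in for L(phi); its minimizer is (1.0, -0.5)."""
    return (phi[0] - 1.0) ** 2 + (phi[1] + 0.5) ** 2


phi_hat = spsa_minimize(toy_pareto_gap, [0.0, 0.0])
print(phi_hat, toy_pareto_gap(phi_hat))
```

SPSA needs only two loss evaluations per iteration regardless of the dimension of φ, which suits settings where the loss is available only through observed equilibrium data and not in closed form.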
Abstract
We study a sequence of independent one-shot non-cooperative games where agents play equilibria determined by a tunable mechanism. Observing only equilibrium decisions, without parametric or distributional knowledge of utilities, we aim to steer equilibria towards social optimality, and to certify when this is impossible due to the game's structure. We develop an adaptive RL framework for this mechanism design objective. First, we derive a multi-agent revealed-preference test for Pareto optimality that gives necessary and sufficient conditions for the existence of utilities under which the empirically observed mixed-strategy Nash equilibria are socially optimal. The conditions form a tractable linear program. Using this, we build an IRL step that computes the Pareto gap, the distance of observed strategies from Pareto optimality, and couple it with a policy-gradient update. We prove convergence to a mechanism that globally minimizes the Pareto gap. This yields a principled achievability test: if social optimality is attainable for the given game and observed equilibria, Algorithm 1 attains it; otherwise, the algorithm certifies unachievability while converging to the mechanism closest to social optimality. We also show a tight link between our loss and robust revealed-preference metrics, allowing algorithmic suboptimality to be interpreted through established microeconomic notions. Finally, when only finitely many i.i.d. samples from mixed strategies (partial strategy specifications) are available, we derive concentration bounds for convergence and design a distributionally robust RL procedure that attains the mechanism-design objective for the fully specified strategies.
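As background for the revealed-preference test mentioned above, the sketch below checks the classical single-agent Generalized Axiom of Revealed Preference (GARP) on price and consumption data. This is the textbook consistency check, not the multi-agent, mixed-strategy condition derived in the paper; the function name `satisfies_garp` and the sample data are assumptions made for illustration.

```python
def satisfies_garp(prices, bundles):
    """Classical single-agent GARP check (illustrative, not the paper's test).

    prices[t] and bundles[t] are the price vector and chosen bundle at
    observation t. Returns True if the data admit a rationalizing utility.
    """
    n = len(prices)

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # cost[t][s]: expenditure needed to buy bundle s at the prices of t.
    cost = [[dot(prices[t], bundles[s]) for s in range(n)] for t in range(n)]

    # Direct revealed preference: t is preferred to s if s was affordable at t.
    pref = [[cost[t][t] >= cost[t][s] for s in range(n)] for t in range(n)]

    # Transitive closure via Warshall's algorithm.
    for k in range(n):
        for i in range(n):
            if pref[i][k]:
                for j in range(n):
                    if pref[k][j]:
                        pref[i][j] = True

    # GARP is violated if t is revealed preferred to s while bundle t was
    # strictly cheaper than bundle s at the prices of s.
    for t in range(n):
        for s in range(n):
            if pref[t][s] and cost[s][s] > cost[s][t]:
                return False
    return True


consistent = satisfies_garp([(1, 2), (2, 1)], [(2, 1), (1, 2)])
violating = satisfies_garp([(1, 2), (2, 1)], [(1, 2), (2, 1)])
print(consistent, violating)
```

In the second data set each bundle is chosen when the other is strictly cheaper, so no utility function can rationalize both choices and the check fails.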
