Nash Incentive-compatible Online Mechanism Learning via Weakly Differentially Private Online Learning
Joon Suk Huh, Kirthevasan Kandasamy
TL;DR
The paper tackles online mechanism design where agents interact over multiple rounds, showing that single-round incentive compatibility does not guarantee truthfulness during learning. It introduces a Nash incentive-compatible online mechanism learning framework that randomly blends a weakly differentially private online learner (based on Hedge) with a commitment mechanism to deter misreports, achieving sublinear regret in adversarial settings with long-sighted agents. A key insight is that a weaker DP notion suffices for NIC, enabling a logarithmic dependence on the mechanism class size; the Hedge-based construction yields regret $O((\log|\Pi|)\cdot T^{(1+h)/2})$ for $h\in[0,1)$ and gives concrete regret bounds for mechanism classes beyond auctions, including online facility location and VCG with ex-post externalities. The framework is simple, extends to multiple learnable parameters, and preserves NIC without requiring prior distributions over agent types, offering a scalable approach to general mechanism design in online adversarial environments.
Abstract
We study a multi-round mechanism design problem, where we interact with a set of agents over a sequence of rounds. We wish to design an incentive-compatible (IC) online learning scheme to maximize an application-specific objective within a given class of mechanisms, without prior knowledge of the agents' type distributions. Even if each mechanism in this class is IC in a single round, if an algorithm naively chooses from this class on each round, the entire learning process may not be IC against non-myopic buyers who appear over multiple rounds. On each round, our method randomly chooses between the recommendation of a weakly differentially private online learning algorithm (e.g., Hedge) and a commitment mechanism which penalizes non-truthful behavior. Our method is IC and achieves $O(T^{\frac{1+h}{2}})$ regret for the application-specific objective in an adversarial setting, where $h$ quantifies the long-sightedness of the agents. When compared to prior work, our approach is conceptually simpler, it applies to general mechanism design problems (beyond auctions), and its regret scales gracefully with the size of the mechanism class.
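
The per-round randomization described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the mixing probability `lam`, the learning rate `eta`, the reward callback `reward_fn`, the full-information update on every round, and the use of `None` to mark a commitment round are all assumptions introduced here for illustration, and the commitment mechanism itself is left abstract.

```python
import math
import random


def hedge_weights(cum_rewards, eta):
    """Exponential-weights (Hedge) distribution over a finite mechanism class.

    Subtracting the maximum before exponentiating avoids overflow.
    """
    m = max(cum_rewards)
    w = [math.exp(eta * (r - m)) for r in cum_rewards]
    z = sum(w)
    return [x / z for x in w]


def run_blend(reward_fn, n_mechanisms, T, eta, lam, seed=0):
    """Blend Hedge with a (hypothetical) commitment mechanism for T rounds.

    reward_fn(t, k): assumed objective value in [0, 1] of mechanism k on round t.
    lam: assumed probability of playing the commitment mechanism on a round.
    Returns the list of per-round choices (None marks a commitment round)
    and the final Hedge distribution.
    """
    rng = random.Random(seed)
    cum = [0.0] * n_mechanisms
    choices = []
    for t in range(T):
        if rng.random() < lam:
            # Commitment round: a placeholder for a mechanism that
            # penalizes non-truthful reports.
            choices.append(None)
        else:
            # Learning round: sample a mechanism from the Hedge distribution.
            p = hedge_weights(cum, eta)
            k = rng.choices(range(n_mechanisms), weights=p)[0]
            choices.append(k)
        # Assumption: rewards of all mechanisms are observed on every round.
        for k in range(n_mechanisms):
            cum[k] += reward_fn(t, k)
    return choices, hedge_weights(cum, eta)


if __name__ == "__main__":
    # Toy objective in which mechanism 2 is always best.
    choices, final_p = run_blend(
        lambda t, k: 1.0 if k == 2 else 0.0,
        n_mechanisms=4, T=200, eta=0.5, lam=0.1,
    )
    print(final_p)
```

In this sketch the randomness of the Hedge sampling step is what stands in for the privacy-like stability of the learner: a single round's reports shift the sampling distribution only slightly when `eta` is small. How `lam` and `eta` should be tuned against $T$ and $h$ is determined by the paper's analysis and is not reproduced here.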
