To Spend or to Gain: Online Learning in Repeated Karma Auctions
Damien Berriaud, Ezzat Elokda, Devansh Jalota, Emilio Frazzoli, Marco Pavone, Florian Dörfler
TL;DR
This work tackles online learning in repeated karma auctions, where payments of an artificial currency are redistributed to bidders each period. It develops adaptive karma pacing, an online strategy resembling dual gradient ascent, and proves that it is asymptotically optimal for a single bidder facing stationary competing bids, induces convergent learning dynamics when all bidders adopt it, and forms an approximate Nash equilibrium in large-population, parallel auctions. The analysis addresses challenges unique to currency gains and redistribution, including non-truthfulness upon losing, budget balance under karma, and non-uniqueness of stationary multipliers, and it requires a novel relaxed dual with a shifted projection. The results provide principled, scalable bidding rules for practical karma mechanisms, with welfare implications suggesting efficient allocation without external money and robustness to heterogeneity across agents. The work also charts future directions: alternative karma redistribution schemes, and adaptive step-size strategies to mitigate the vanishing-box problem and improve convergence in realistic settings.
Abstract
Recent years have seen a surge of artificial currency-based mechanisms in contexts where monetary instruments are deemed unfair or inappropriate, e.g., in allocating food donations to food banks, course seats to students, and, more recently, even for traffic congestion management. Yet the applicability of these mechanisms remains limited in repeated auction settings, as it is challenging for users to learn how to bid an artificial currency that has no value outside the auctions. Indeed, users must jointly learn the value of the currency in addition to how to spend it optimally. Moreover, in the prominent class of karma mechanisms, in which artificial karma payments are redistributed to users at each time step, users do not only spend karma to obtain public resources but also gain karma for yielding them. For this novel class of karma auctions, we propose an adaptive karma pacing strategy that learns to bid optimally, and show that this strategy a) is asymptotically optimal for a single user bidding against competing bids drawn from a stationary distribution; b) leads to convergent learning dynamics when all users adopt it; and c) constitutes an approximate Nash equilibrium as the number of users grows. Our results require a novel analysis in comparison to adaptive pacing strategies in monetary auctions, since we depart from the classical assumption that the currency has known value outside the auctions, and consider that the currency is both spent and gained through the redistribution of payments.
