Reducing Optimism Bias in Incomplete Cooperative Games
Filip Úradník, David Sychrovský, Jakub Černý, Martin Černý
TL;DR
This work studies optimism bias in incomplete cooperative games by introducing the utopian gap $\mathcal{G}_{(N,v)}(\mathcal{K})$, which bounds the discrepancy between the players' potential Shapley-based payoffs across feasible $\mathbb{S}^n$-extensions and the grand coalition value $v(N)$. A principal sequentially reveals coalition values under a budget; the problem is formulated both offline and online, with the goal of minimizing the expected gap under a known prior $\mathcal{F}$. The authors develop Offline Optimal and Offline Greedy algorithms and use PPO-based online learning to discover revealing policies. The results demonstrate the benefit of revealing larger coalitions, especially in supermodular settings, where roughly $\mathcal{O}(n)$ revelations can capture most of the information. Experiments on factory-type and supermodular games show substantial gap reductions and highlight practical revelation patterns. The work also offers a geometric interpretation of uncertainty as a bounded extension space, which is relevant to SHAP-like explanation settings in cooperative AI.
Abstract
Cooperative game theory has diverse applications in contemporary artificial intelligence, including interpretable machine learning, resource allocation, and collaborative decision-making. However, specifying a cooperative game entails assigning values to exponentially many coalitions, and obtaining even a single value can be resource-intensive in practice. Yet simply leaving certain coalition values undisclosed introduces ambiguity regarding individual contributions to the collective grand coalition. This ambiguity often leads players to hold overly optimistic expectations, stemming from either inherent biases or strategic considerations, and frequently results in collective claims that exceed the actual grand coalition value. In this paper, we present a framework for optimizing the order in which coalition values are revealed, with the overarching goal of efficiently closing the gap between players' expectations and achievable outcomes in cooperative games. Our contributions are threefold: (i) we study the individual players' optimistic completions of games with missing coalition values along with the arising gap, and investigate its analytical properties that facilitate more efficient optimization; (ii) we develop methods to minimize this gap over classes of games with a known prior by disclosing values of additional coalitions in both offline and online fashion; and (iii) we empirically demonstrate the algorithms' performance in practical scenarios, together with an investigation into the typical order of revealing coalition values.
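To make the idea of an optimism gap concrete, the following is a minimal sketch on a toy three-player game. It is an illustration only, not the paper's definition or algorithm: the game values, the grid of candidate completions {0, 3, 6} for the unknown pair coalitions, and the helper names `shapley`, `candidate_games`, and `optimistic_gap` are all assumptions introduced here. Each player takes the completion that maximizes their own Shapley value, and the gap is the sum of these best-case payoffs minus $v(N)$.

```python
from itertools import combinations, permutations, product
from math import factorial

N = 3          # toy number of players (assumption)
V_GRAND = 6    # known grand coalition value v(N) (assumption)
PAIRS = [frozenset(p) for p in combinations(range(N), 2)]


def shapley(n, v):
    """Shapley values of a complete game v: frozenset -> number,
    computed by averaging marginal contributions over all orderings."""
    phi = [0.0] * n
    for order in permutations(range(n)):
        coalition = frozenset()
        for i in order:
            with_i = coalition | {i}
            phi[i] += v[with_i] - v[coalition]
            coalition = with_i
    return [p / factorial(n) for p in phi]


def candidate_games(revealed):
    """Enumerate hypothetical completions on a coarse grid, keeping only
    those that agree with the revealed pair values."""
    base = {frozenset(): 0, frozenset(range(N)): V_GRAND}
    base.update({frozenset({i}): 0 for i in range(N)})
    games = []
    for values in product([0, 3, 6], repeat=len(PAIRS)):
        pair_values = dict(zip(PAIRS, values))
        if all(pair_values[s] == x for s, x in revealed.items()):
            games.append({**base, **pair_values})
    return games


def optimistic_gap(revealed):
    """Sum of each player's best-case Shapley value over the candidate
    completions, minus the grand coalition value."""
    payoffs = [shapley(N, v) for v in candidate_games(revealed)]
    best = [max(phi[i] for phi in payoffs) for i in range(N)]
    return sum(best) - V_GRAND


gap_none = optimistic_gap({})                           # nothing revealed
gap_one = optimistic_gap({frozenset({1, 2}): 3})        # one pair revealed
gap_all = optimistic_gap({s: 3 for s in PAIRS})         # all pairs revealed
print(gap_none, gap_one, gap_all)
```

In this toy setting the gap shrinks as more coalition values are disclosed and vanishes once the game is fully specified, because the Shapley value of a complete game always sums to $v(N)$. Which coalition to reveal next, and in what order, is the question the paper's offline and online methods address.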
