Causal Strategic Learning with Competitive Selection
Kiet Q. H. Vo, Muneeb Aadil, Siu Lun Chau, Krikamol Muandet
TL;DR
This work studies causal strategic learning with selection in a sequential setting with multiple decision makers (DMs). It shows that the optimal selection rule must trade off selecting the best agents against incentivising their improvement, and that it often relies on noncausal predictions of agents' outcomes. Under linearity and boundedness assumptions, the paper derives an explicit AO selection rule $\bm{\theta}^{AO}=\frac{\bm{\alpha}+\gamma\bm{\mathcal{E}}\bm{\mathcal{E}}^{\top}\bm{\theta}^{*}}{\|\bm{\alpha}+\gamma\bm{\mathcal{E}}\bm{\mathcal{E}}^{\top}\bm{\theta}^{*}\|_2}$ and identifies a welfare safeguard: a positive cosine between the AO direction and the causal-improvement direction. To recover the true causal parameters from observational data under selection, the paper introduces ranking-based strategies and Mean-shift Linear Regression (MSLR), together with a cooperative protocol that enables unbiased estimation when multiple DMs are involved. Experiments on synthetic admissions data illustrate that AO decisions improve utility and that MSLR yields consistent estimates of $\bm{\theta}^{*}$ even under competitive selection, highlighting the importance of causal modeling and regulatory coordination for mitigating gaming and preserving agent welfare.
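As a rough illustration of the closed-form rule above, the following sketch evaluates $\bm{\theta}^{AO}$ numerically and checks the sign of the cosine condition. It rests on several assumptions that are not stated in this summary: $\bm{\alpha}$ and $\bm{\theta}^{*}$ are treated as plain vectors of equal length, $\bm{\mathcal{E}}$ as a matrix with that many rows, and the causal-improvement direction is taken to be $\bm{\mathcal{E}}\bm{\mathcal{E}}^{\top}\bm{\theta}^{*}$. The function names and the toy numbers are hypothetical and chosen only for illustration.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of matrix M (list of rows)."""
    return [list(col) for col in zip(*M)]


def norm2(v):
    """Euclidean (L2) norm of vector v."""
    return math.sqrt(sum(x * x for x in v))


def cosine(u, v):
    """Cosine of the angle between non-zero vectors u and v."""
    return sum(a * b for a, b in zip(u, v)) / (norm2(u) * norm2(v))


def ao_selection_rule(alpha, E, theta_star, gamma):
    """Evaluate (alpha + gamma * E E^T theta*) / ||alpha + gamma * E E^T theta*||_2.

    Assumed shapes (not given in the summary): alpha and theta_star have
    length m, and E has m rows.
    """
    # E E^T theta*, computed as E (E^T theta*) to avoid forming E E^T.
    improvement = matvec(E, matvec(transpose(E), theta_star))
    unnormalised = [a + gamma * g for a, g in zip(alpha, improvement)]
    length = norm2(unnormalised)
    if length == 0.0:
        raise ValueError("alpha + gamma * E E^T theta* is zero; rule undefined")
    return [x / length for x in unnormalised], improvement


# Toy inputs, purely illustrative.
alpha = [1.0, 0.0]
E = [[1.0, 0.0], [0.0, 1.0]]
theta_star = [0.0, 1.0]
gamma = 1.0

theta_ao, improvement = ao_selection_rule(alpha, E, theta_star, gamma)
# Assumed reading of the safeguard: the cosine between theta_ao and the
# improvement direction should be positive.
safeguard_holds = cosine(theta_ao, improvement) > 0.0
```

With these toy inputs the unnormalised vector is $(1, 1)$, so the rule returns a unit vector with both entries equal to $1/\sqrt{2}$ and the cosine check passes. Larger values of `gamma` pull the rule towards the improvement direction, which mirrors the trade-off between selection and incentives described above.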
Abstract
We study the problem of agent selection in causal strategic learning under multiple decision makers and address two key challenges that come with it. Firstly, while much of the prior work studies a fixed pool of agents that remains static regardless of their evaluations, we consider the impact of the selection procedure, by which agents are not only evaluated but also selected. When each decision maker unilaterally selects agents by maximising their own utility, we show that the optimal selection rule is a trade-off between selecting the best agents and providing incentives that maximise the agents' improvement. Furthermore, this optimal selection rule relies on incorrect predictions of agents' outcomes. Hence, we study the conditions under which a decision maker's optimal selection rule will neither lead to a deterioration of agents' outcomes nor cause an unjust reduction in agents' selection chances. To that end, we provide an analytical form of the optimal selection rule and a mechanism to retrieve the causal parameters from observational data, under certain assumptions on agents' behaviour. Secondly, when there are multiple decision makers, the interference between selection rules introduces another source of bias in estimating the underlying causal parameters. To address this problem, we provide a cooperative protocol that all decision makers must collectively adopt to recover the true causal parameters. Lastly, we complement our theoretical results with simulation studies. Our results highlight not only the importance of causal modeling as a strategy to mitigate the effect of gaming, as suggested by previous work, but also the need for a benevolent regulator to enable it.
