Remember, but also, Forget: Bridging Myopic and Perfect Recall Fairness with Past-Discounting
Ashwin Kumar, William Yeoh
TL;DR
The paper addresses fairness in dynamic, multi-agent resource allocation by introducing past-discounted historical utilities. It defines a tunable decay factor $\gamma_p$ to interpolate between instantaneous and perfect-recall fairness, and shows that this approach yields a bounded augmented state space, enhancing computational tractability for sequential decision-making. Formulations are provided for both additive and averaged utilities, with theoretical justification and comparative analysis against traditional paradigms. The work aims to better align fairness with human temporal judgments while enabling scalable reinforcement learning in dynamic settings.
Abstract
Dynamic resource allocation in multi-agent settings often requires balancing efficiency with fairness over time, a challenge that conventional, myopic fairness measures address inadequately. Motivated by behavioral insights that human judgments of fairness evolve with temporal distance, we introduce a framework for temporal fairness that incorporates past-discounting mechanisms. By applying a tunable discount factor to historical utilities, our approach interpolates between instantaneous and perfect-recall fairness, thereby capturing both immediate outcomes and long-term equity considerations. Beyond aligning more closely with human perceptions of fairness, this past-discounting method ensures that the augmented state space remains bounded, significantly improving computational tractability in sequential decision-making settings. We detail the formulation of discounted-recall fairness in both additive and averaged utility contexts, illustrate its benefits through practical examples, and discuss its implications for designing balanced, scalable resource allocation strategies.
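To make the interpolation concrete, the following is a minimal sketch of one plausible reading of past-discounting. It is not taken from the paper: the recurrence `z = gamma_p * z + u`, the normalizer used for the averaged variant, and the function names are all illustrative assumptions. Under this assumed form, `gamma_p = 0` keeps only the current utility (instantaneous fairness), `gamma_p = 1` keeps the full undiscounted sum (perfect recall), and for `gamma_p < 1` with per-step utilities in `[0, u_max]` the accumulated value never exceeds `u_max / (1 - gamma_p)`, which is the sense in which the augmented state stays bounded.

```python
def discounted_additive(utilities, gamma_p):
    """Assumed additive form: z_t = gamma_p * z_{t-1} + u_t (hypothetical)."""
    z = 0.0
    history = []
    for u in utilities:
        z = gamma_p * z + u  # older utilities decay geometrically
        history.append(z)
    return history


def discounted_average(utilities, gamma_p):
    """Assumed averaged form: discounted sum divided by discounted step count."""
    z = 0.0
    n = 0.0
    history = []
    for u in utilities:
        z = gamma_p * z + u
        n = gamma_p * n + 1.0  # effective number of remembered steps
        history.append(z / n)
    return history


if __name__ == "__main__":
    stream = [1.0, 0.0, 1.0]
    for g in (0.0, 0.5, 1.0):
        print(g, discounted_additive(stream, g), discounted_average(stream, g))
```

A fairness measure would then be evaluated over these discounted per-agent values rather than over raw cumulative totals; which measure to use is left open here.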
