Incentivizing Time-Aware Fairness in Data Sharing
Jiangwei Chen, Kieu Thao Nguyen Pham, Rachael Hwee Ling Sim, Arun Verma, Zhaoxuan Wu, Chuan-Sheng Foo, Bryan Kian Hsiang Low
TL;DR
This paper addresses asynchronous data sharing in collaborative ML by introducing time-aware incentives that reward early contributions while preserving fairness. The authors formalize incentive conditions and data-valuation requirements, and propose two reward schemes that integrate joining-time information with Shapley-based concepts. Rewards can be realized exactly or approximately via likelihood tempering or subset selection. Empirical results on synthetic and real datasets show that early joiners receive higher rewards that satisfy individual rationality (IR), and that model performance improves with collaboration. The framework balances data value and timing, offering practical mechanisms for motivating timely, high-quality data sharing under non-simultaneous participation. While effective, the approach faces computational and privacy considerations, motivating future work on scalability, privacy-preserving variants, and extensions to repeated or online data sharing.
Abstract
In collaborative data sharing and machine learning, multiple parties aggregate their data resources to train a machine learning model with better performance. However, as the parties incur data collection costs, they are only willing to share their data when guaranteed incentives such as fairness and individual rationality. Existing frameworks assume that all parties join the collaboration simultaneously, which does not hold in many real-world scenarios. Due to the long processing time for data cleaning, difficulty in overcoming legal barriers, or lack of awareness, the parties may join the collaboration at different times. In this work, we propose the following perspective: because a party that joins earlier incurs higher risk and encourages contributions from other wait-and-see parties, it should receive a reward of higher value for sharing data earlier. To this end, we propose a fair and time-aware data sharing framework, including novel time-aware incentives. We develop new methods for deciding reward values that satisfy these incentives. We further illustrate how to generate model rewards that realize the reward values and empirically demonstrate the properties of our methods on synthetic and real-world datasets.
