Learning Potentials for Dynamic Matching and Application to Heart Transplantation
Itai Zilberstein, Ioannis Anagnostides, Zachary W. Sollie, Arman Kilic, Tuomas Sandholm
TL;DR
This work frames dynamic heart transplant allocation as online bipartite matching and introduces potential-based policies that balance immediate utility against long-term waitlist value. It replaces prior black-box optimization with an offline imitation-learning framework: expressive neural-network potentials are trained to mimic an omniscient hindsight oracle, with semi-synthetic data supplementing the training set. On real UNOS data, the approach substantially improves on the US status quo, CAS, and myopic baselines, reaching around 95% of the omniscient upper bound and demonstrating the value of incorporating waitlist state into decision-making. The contributions offer a scalable path toward more effective organ allocation, while highlighting fairness trade-offs and the need for equity-aware extensions in future work.
Abstract
Each year, thousands of patients in need of heart transplants face life-threatening wait times due to organ scarcity. While allocation policies aim to maximize population-level outcomes, current approaches often fail to account for the dynamic arrival of organs and the composition of waitlisted candidates, thereby hampering efficiency. The United States is transitioning from rigid, rule-based allocation to more flexible data-driven models. In this paper, we propose a novel framework for non-myopic policy optimization in general online matching relying on potentials, a concept originally introduced for kidney exchange. We develop scalable and accurate ways of learning potentials that are higher-dimensional and more expressive than prior approaches. Our approach is a form of self-supervised imitation learning: the potentials are trained to mimic an omniscient algorithm that has perfect foresight. We focus on the application of heart transplant allocation and demonstrate, using real historical data, that our policies significantly outperform prior approaches -- including the current US status quo policy and the proposed continuous distribution framework -- in optimizing for population-level outcomes. Our analysis and methods come at a pivotal moment in US policy, as the current heart transplant allocation system is under review. We propose a scalable and theoretically grounded path toward more effective organ allocation.
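To make the idea of a potential-based policy concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. It assumes a potential estimates the future value of keeping a candidate on the waitlist, and that an arriving organ goes to the candidate maximizing immediate utility minus that potential. The names `Candidate`, `allocate`, `toy_utility`, and `toy_potential`, and all numeric values, are hypothetical; in the paper the potentials are learned by a neural network rather than looked up from a table.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical waitlisted patient with a single illustrative feature."""
    name: str
    urgency: float


def allocate(
    organ_quality: float,
    waitlist: Sequence[Candidate],
    utility: Callable[[float, Candidate], float],
    potential: Callable[[Candidate, Sequence[Candidate]], float],
) -> Optional[Candidate]:
    """Pick the candidate maximizing immediate utility minus potential.

    A potential that is identically zero recovers the myopic (greedy) policy.
    Returns None when the waitlist is empty.
    """
    if not waitlist:
        return None
    return max(
        waitlist,
        key=lambda c: utility(organ_quality, c) - potential(c, waitlist),
    )


def toy_utility(organ_quality: float, c: Candidate) -> float:
    # Assumed for illustration: utility scales with organ quality and urgency.
    return organ_quality * c.urgency


# Assumed potentials; a learned model would compute these from the waitlist state.
TOY_POTENTIALS = {"A": 0.5, "B": 0.1}


def toy_potential(c: Candidate, waitlist: Sequence[Candidate]) -> float:
    # The waitlist argument is unused here but shows where state would enter.
    return TOY_POTENTIALS[c.name]


def zero_potential(c: Candidate, waitlist: Sequence[Candidate]) -> float:
    return 0.0


waitlist = [Candidate("A", urgency=0.9), Candidate("B", urgency=0.8)]
myopic_choice = allocate(1.0, waitlist, toy_utility, zero_potential)
non_myopic_choice = allocate(1.0, waitlist, toy_utility, toy_potential)
```

In this toy setting the myopic policy selects candidate A, who has the highest immediate utility, while the potential-based policy selects candidate B, because A's assumed potential signals more value in waiting for a later organ.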
