Learning Optimal Stable Matches in Decentralized Markets with Unknown Preferences

Vade Shah; Bryce L. Ferguson; Jason R. Marden

TL;DR

This work tackles decentralized two-sided matching when proposers have unknown preferences. It introduces a completely uncoupled online learning rule $\Gamma$ that, for any $0\le p<1$, drives the system toward the proposer-optimal stable match $\mu^*$ with probability at least $p$ for large times, without central coordination or assumptions on market structure. The authors prove that $\mu^*$ is the unique stochastically stable state by leveraging regular perturbed Markov processes and the lattice structure of stable matches, and they validate the approach via simulations in small markets. The results generalize Gale–Shapley to information-poor, decentralized settings and offer a principled baseline for future, faster-converging decentralized matching algorithms with provable guarantees.

Abstract

Matching algorithms have demonstrated great success in several practical applications, but they often require centralized coordination and plentiful information. In many modern online marketplaces, agents must independently seek out and match with one another using little to no information. For these kinds of settings, can we design decentralized, limited-information matching algorithms that preserve the desirable properties of standard centralized techniques? In this work, we constructively answer this question in the affirmative. We model a two-sided matching market as a game consisting of two disjoint sets of agents, referred to as proposers and acceptors, each of whom seeks to match with their most preferred partner on the opposite side of the market. However, each proposer has no knowledge of their own preferences, so they must learn their preferences while forming matches in the market. We present a simple online learning rule that guarantees a strong notion of probabilistic convergence to the welfare-maximizing equilibrium of the game, referred to as the proposer-optimal stable match. To the best of our knowledge, this represents the first completely decoupled, communication-free algorithm that guarantees probabilistic convergence to an optimal stable match, irrespective of the structure of the matching market.
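For orientation, the centralized, full-information benchmark that this work aims to reproduce can be sketched in a few lines. The block below is a minimal sketch of proposer-proposing deferred acceptance (the Gale–Shapley procedure), which is known to return the proposer-optimal stable match when preferences are known. It is not the paper's learning rule $\Gamma$; the function name `deferred_acceptance`, the agent labels, and the example preferences are all assumptions made for illustration.

```python
from collections import deque


def deferred_acceptance(proposer_prefs, acceptor_prefs):
    """Proposer-proposing deferred acceptance (Gale-Shapley).

    proposer_prefs[p] lists p's acceptors, most preferred first.
    acceptor_prefs[a] lists a's proposers, most preferred first.
    Assumes complete preference lists and equally many agents per side.
    Returns a dict mapping each proposer to their matched acceptor.
    """
    # rank[a][p] is the position of p in a's list (lower is better).
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of next acceptor to try
    holder = {}                                   # acceptor -> tentatively held proposer
    free = deque(proposer_prefs)                  # proposers without a tentative match

    while free:
        p = free.popleft()
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = holder.get(a)
        if current is None:
            holder[a] = p                         # a was unmatched: hold p
        elif rank[a][p] < rank[a][current]:
            holder[a] = p                         # a prefers p: release current
            free.append(current)
        else:
            free.append(p)                        # a rejects p: p tries again later

    return {p: a for a, p in holder.items()}


# Hypothetical 3x3 market (illustrative preferences, not taken from the paper).
proposer_prefs = {
    "p1": ["a1", "a2", "a3"],
    "p2": ["a1", "a3", "a2"],
    "p3": ["a2", "a1", "a3"],
}
acceptor_prefs = {
    "a1": ["p2", "p1", "p3"],
    "a2": ["p1", "p3", "p2"],
    "a3": ["p3", "p2", "p1"],
}
match = deferred_acceptance(proposer_prefs, acceptor_prefs)
print(match)
```

Note that this procedure reads every preference list directly, which is exactly the information that proposers in the paper's setting do not have.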

Paper Structure (12 sections, 5 theorems, 7 equations, 3 figures, 2 algorithms)

Introduction
Model
Market Model
Game Model
Learning Model and Main Result
Description of Learning Rule
Simulation
Conclusion
Proof of Theorem
Stochastic stability
Matching theory
Proof

Key Result

Theorem 1

Consider the matching game defined by the market $\mathbf{M}$. For every probability $0 \leq p < 1$, there exists a learning rule $\Gamma$ of the form defined in the paper's learning-rule equation such that, if every proposer selects their action according to $\Gamma$, then for all sufficiently large timesteps $t$, …

Figures (3)

Figure 1: Two-sided matching market models with full (top) and partial information (bottom) in centralized (left) and decentralized environments (right). In a full information model, proposers know their own preferences completely, but in a partial information model, they do not. In a centralized environment, agents provide their preferences (or a subset thereof) to a central algorithm that assigns a match, but in a decentralized environment, proposers directly propose to acceptors to form a match.
Figure 2: A market with three proposers (left) and three acceptors (right). Each agent's preference ordering is shown as a list beneath their icon. The arrows indicate to whom each proposer proposes; this configuration corresponds to the proposer-optimal stable match in which every proposer is matched with their favorite acceptor.
Figure 3: When every agent follows the learning rule with parameters $\epsilon = 0.001$, $F(u) = -0.49 \exp (-4u)$, and $G(\Delta u) = -0.49 \exp (-4 \Delta u)$, the empirical frequency of reaching any stable match (SM) and the proposer-optimal stable match (POSM) increases with the number of timesteps.
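The captions above rely on the notion of a stable match: a match is stable when no proposer and acceptor both prefer each other to their assigned partners (such a pair is called a blocking pair). The following is a minimal sketch of that check. The helper name `blocking_pairs` and the preference lists are hypothetical; in particular, the lists are not the ones drawn in Figure 2, only a three-by-three market in which every proposer can likewise be matched with their favorite acceptor.

```python
def blocking_pairs(match, proposer_prefs, acceptor_prefs):
    """Return every (proposer, acceptor) pair that blocks `match`.

    match maps each proposer to an acceptor (one-to-one, everyone matched).
    Preference lists are ordered from most to least preferred.
    """
    partner_of = {a: p for p, a in match.items()}
    blocks = []
    for p, prefs in proposer_prefs.items():
        for a in prefs:
            if a == match[p]:
                break  # every later acceptor is worse than p's current partner
            a_prefs = acceptor_prefs[a]
            # a also prefers p to its current partner: (p, a) blocks the match.
            if a_prefs.index(p) < a_prefs.index(partner_of[a]):
                blocks.append((p, a))
    return blocks


# Hypothetical preferences (assumed for illustration only).
proposer_prefs = {
    "p1": ["a1", "a2", "a3"],
    "p2": ["a2", "a3", "a1"],
    "p3": ["a3", "a1", "a2"],
}
acceptor_prefs = {
    "a1": ["p2", "p3", "p1"],
    "a2": ["p3", "p1", "p2"],
    "a3": ["p1", "p2", "p3"],
}

favorites = {"p1": "a1", "p2": "a2", "p3": "a3"}  # every proposer gets their top choice
second = {"p1": "a2", "p2": "a3", "p3": "a1"}     # every agent gets their second choice
swapped = {"p1": "a1", "p2": "a3", "p3": "a2"}    # p3 and a1 prefer each other

print(blocking_pairs(favorites, proposer_prefs, acceptor_prefs))
print(blocking_pairs(second, proposer_prefs, acceptor_prefs))
print(blocking_pairs(swapped, proposer_prefs, acceptor_prefs))
```

In this assumed market both `favorites` and `second` are stable, which illustrates why reaching some stable match (SM) and reaching the proposer-optimal one (POSM) are tracked separately.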

Theorems & Definitions (6)

Theorem 1
Lemma 1 (Young, 1993)
Lemma 2 (Ackermann et al., 2008)
Lemma 3
Corollary 1
Proof
