Probably Correct Optimal Stable Matching for Two-Sided Markets Under Uncertainty

Andreas Athanasopoulos, Anne-Marie George, Christos Dimitrakakis

TL;DR

This work tackles the problem of identifying the Probably Correct Optimal Stable Matching (PCOS) in centralized two-sided markets when the left side's preferences are unknown and feedback is noisy. Rather than minimizing regret, it treats the task as pure exploration, introducing the PCOS notion and four learning algorithms (Uniform Exploration, Elimination, Improved Elimination, Adaptive Sampling) with theoretical sample-complexity guarantees and empirical validation on synthetic data. The key contributions are algorithmic frameworks that learn stable matchings without restricting exploration to currently stable actions, sample-complexity bounds such as $O\left(\sum_{(p,a) \in m_s^{\star}} \frac{\ln\left(KN/\delta \Delta_{p,a}\right)}{\Delta_{p,a}^2}\right)$ for some of the variants, and experimental evidence that adaptive sampling performs best. The results clarify how to gather information efficiently in two-sided markets under uncertainty and offer practical routes to fast, high-probability identification of the optimal stable matching in centralized settings.

Abstract

We consider a learning problem for the stable marriage model under unknown preferences for the left side of the market. We focus on the centralized case, where at each time step, an online platform matches the agents, and obtains a noisy evaluation reflecting their preferences. Our aim is to quickly identify the stable matching that is left-side optimal, rendering this a pure exploration problem with bandit feedback. We specifically aim to find Probably Correct Optimal Stable Matchings and present several bandit algorithms to do so. Our findings provide a foundational understanding of how to efficiently gather and utilize preference information to identify the optimal stable matching in two-sided markets under uncertainty. An experimental analysis on synthetic data complements theoretical results on sample complexities for the proposed methods.
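
The pipeline described in the abstract (match agents, observe noisy feedback, then identify the left-side optimal stable matching) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the round-robin sampling schedule, the Gaussian noise model, and the assumption that there are at most as many players as arms are all illustrative choices. The sketch samples every player-arm pair equally often, ranks arms by estimated reward, and runs player-proposing deferred acceptance (DA) on the estimated rankings.

```python
import random


def deferred_acceptance(player_prefs, arm_prefs):
    """Player-proposing DA. Each preference list is ordered best first.

    Assumes complete lists and no more players than arms.
    Returns a dict mapping each player to its matched arm.
    """
    n_players, n_arms = len(player_prefs), len(arm_prefs)
    # rank[a][p] = position of player p in arm a's list (lower is better)
    rank = [{p: r for r, p in enumerate(prefs)} for prefs in arm_prefs]
    next_choice = [0] * n_players   # index of the next arm each player proposes to
    holder = [None] * n_arms        # player currently held by each arm
    free = list(range(n_players))
    while free:
        p = free.pop()
        a = player_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = holder[a]
        if current is None:
            holder[a] = p
        elif rank[a][p] < rank[a][current]:
            holder[a] = p           # arm a trades up; the old player is free again
            free.append(current)
        else:
            free.append(p)          # proposal rejected
    return {holder[a]: a for a in range(n_arms) if holder[a] is not None}


def uniform_exploration(true_means, arm_prefs, samples_per_pair,
                        noise_std=1.0, seed=0):
    """Sample every (player, arm) pair equally often, then run DA on estimates.

    true_means[p][a] is the (hypothetical) mean reward of player p for arm a;
    it is used here only to simulate noisy feedback.
    """
    rng = random.Random(seed)
    n_players, n_arms = len(true_means), len(arm_prefs)
    sums = [[0.0] * n_arms for _ in range(n_players)]
    # Round-robin schedule: in round t, player p is matched to arm (p + t) mod K,
    # which is a valid matching whenever n_players <= n_arms.
    for t in range(samples_per_pair * n_arms):
        for p in range(n_players):
            a = (p + t) % n_arms
            sums[p][a] += true_means[p][a] + rng.gauss(0.0, noise_std)
    # Equal sample counts, so ranking by sums equals ranking by empirical means.
    est_prefs = [sorted(range(n_arms), key=lambda a: -sums[p][a])
                 for p in range(n_players)]
    return deferred_acceptance(est_prefs, arm_prefs)


# Toy instance with 3 players and 3 arms (values chosen for illustration only).
true_means = [[0.9, 0.5, 0.1],
              [0.8, 0.6, 0.2],
              [0.3, 0.7, 0.9]]
arm_prefs = [[1, 0, 2], [0, 1, 2], [2, 0, 1]]
matching = uniform_exploration(true_means, arm_prefs, samples_per_pair=200,
                               noise_std=0.1)
```

Uniform sampling is the simplest baseline: it spends the same budget on every pair, whereas the elimination and adaptive variants named in the TL;DR aim to concentrate samples where they matter for the final matching.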

Paper Structure (19 sections, 7 theorems, 9 equations, 4 figures, 4 algorithms)

Key Result

Proposition 1

Let $m_{\hat{\pi}}$ be the output of the DA algorithm run with the estimated preferences $\hat{\pi}$ for the agents $\mathfrak{A}$. The probability that $m_{\hat{\pi}}$ equals the optimal stable matching $m_s^{\star}$ for the true preferences $\pi$ is at least as high as the probability of tho

Figures (4)

  • Figure 1: Sample complexity for the proposed algorithms for the two different reward settings, averaged over the runs.
  • Figure 2: Anytime performance of the algorithms for the first instance with 20 agents on each side. The figure illustrates the average number of times the algorithms identify (left) the optimal stable matching, (middle) the correct preferences up to the stable match for every player, and (right) the correct preferences for every player, after each matching.
  • Figure 3: Anytime performance of the algorithm for the first instance setting with varying numbers of agents on each side. The figure illustrates the average number of times the algorithm identifies (left) the optimal stable matching, (middle) the correct preferences up to the stable match for each player, and (right) the correct preferences for every player after each matching.
  • Figure 4: Anytime performance of the algorithm for the second instance setting with varying numbers of agents on each side. The figure illustrates the average number of times the algorithm identifies (left) the optimal stable matching, (middle) the correct preferences up to the stable match for each player, and (right) the correct preferences for every player after each matching.

Theorems & Definitions (14)

  • Definition 1: Probably Correct Optimal Stable Matching
  • Remark 1
  • Definition 2: Completely Correct Preferences
  • Definition 3: Partially Correct Preferences up to an Arm
  • Proposition 1
  • Theorem 1
  • Lemma 1
  • Theorem 2
  • Remark 2
  • Remark 3
  • ...and 4 more