Putting Gale & Shapley to Work: Guaranteeing Stability Through Learning

Hadi Hosseini, Sanjukta Roy, Duohan Zhang

TL;DR

This paper exploits the structure of stable solutions to devise algorithms that improve the likelihood of finding stable solutions. It also initiates the study of the sample complexity of finding a stable matching and provides theoretical bounds on the number of samples needed to reach a stable matching with high probability.

Abstract

Two-sided matching markets describe a large class of problems wherein participants from one side of the market must be matched to those from the other side according to their preferences. In many real-world applications (e.g. content matching or online labor markets), the knowledge about preferences may not be readily available and must be learned, i.e., one side of the market (aka agents) may not know their preferences over the other side (aka arms). Recent research on online settings has focused primarily on welfare optimization aspects (i.e. minimizing the overall regret) while paying little attention to the game-theoretic properties such as the stability of the final matching. In this paper, we exploit the structure of stable solutions to devise algorithms that improve the likelihood of finding stable solutions. We initiate the study of the sample complexity of finding a stable matching, and provide theoretical bounds on the number of samples needed to reach a stable matching with high probability. Finally, our empirical results demonstrate intriguing tradeoffs between stability and optimality of the proposed algorithms, further complementing our theoretical findings.

Paper Structure

This paper contains 23 sections, 13 theorems, 32 equations, 4 figures, 1 table, 3 algorithms.

Key Result

Theorem 1

Assume that the true preferences satisfy the uniqueness consistency condition. For any estimated utility $\hat{\mu}$, if the agent-proposing DA algorithm produces a stable matching, then the arm-proposing DA algorithm also produces a stable matching.
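
The relationship in Theorem 1 can be illustrated with a small sketch. The code below is not the paper's implementation: it is a generic deferred-acceptance (DA) routine plus a blocking-pair check, run on a hypothetical 2x2 instance whose names (`a1`, `b1`, `est_mu`, and so on) and utility values are assumptions made up for illustration. Arms are assumed to know their own preferences, while agents rank arms by an estimated utility $\hat{\mu}$ that is wrong for one agent. The toy instance has a unique stable matching; whether it meets the paper's exact uniqueness consistency definition is not verified here.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Generic DA. Each prefs dict maps a participant to a list of the other
    side, most preferred first. Returns a dict proposer -> receiver."""
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_idx = {p: 0 for p in proposer_prefs}  # next receiver each proposer tries
    holds = {}                                 # receiver -> tentatively held proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_idx[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unmatched
        r = proposer_prefs[p][next_idx[p]]
        next_idx[p] += 1
        current = holds.get(r)
        if current is None:
            holds[r] = p
        elif rank[r][p] < rank[r][current]:
            holds[r] = p
            free.append(current)  # the displaced proposer proposes again
        else:
            free.append(p)        # rejected, p tries its next choice
    return {p: r for r, p in holds.items()}


def is_stable(matching, agent_prefs, arm_prefs):
    """Check an agent -> arm matching for blocking pairs under the given
    (true) preferences."""
    arm_to_agent = {arm: agent for agent, arm in matching.items()}
    for agent, prefs in agent_prefs.items():
        for arm in prefs:
            if matching.get(agent) == arm:
                break  # every later arm is worse than the current partner
            other = arm_to_agent.get(arm)
            if other is None or arm_prefs[arm].index(agent) < arm_prefs[arm].index(other):
                return False  # (agent, arm) is a blocking pair
    return True


# Hypothetical instance: true preferences (unique stable matching a1-b1, a2-b2).
true_agent_prefs = {"a1": ["b1", "b2"], "a2": ["b1", "b2"]}
arm_prefs = {"b1": ["a1", "a2"], "b2": ["a2", "a1"]}

# Assumed estimated utilities; a1's estimate wrongly ranks b2 above b1.
est_mu = {"a1": {"b1": 0.4, "b2": 0.6}, "a2": {"b1": 0.9, "b2": 0.2}}
est_agent_prefs = {a: sorted(u, key=u.get, reverse=True) for a, u in est_mu.items()}

# Agent-proposing DA on estimated preferences.
agent_da = deferred_acceptance(est_agent_prefs, arm_prefs)

# Arm-proposing DA: arms propose, agents accept using estimated preferences.
arm_da = {agent: arm
          for arm, agent in deferred_acceptance(arm_prefs, est_agent_prefs).items()}

agent_da_stable = is_stable(agent_da, true_agent_prefs, arm_prefs)
arm_da_stable = is_stable(arm_da, true_agent_prefs, arm_prefs)
```

In this toy instance the arm-proposing run stays stable under the true preferences while the agent-proposing run does not, which is consistent with the direction of the implication in Theorem 1: stability of agent-proposing DA implies stability of arm-proposing DA, but not the converse.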

Figures (4)

  • Figure 1: $95\%$ confidence interval of stability and regret for $200$ randomized general preference profiles.
  • Figure 2: $95\%$ confidence interval of stability and regrets for $200$ randomized SPC preference profiles. Please see the definition of SPC in the supplemental experiments section. An SPC preference profile has a unique stable matching.
  • Figure 3: $95\%$ confidence interval of agent-pessimal stable regrets for $200$ randomized general preference profiles.
  • Figure 4: $95\%$ confidence interval of stability and regrets for $200$ randomized agent masterlist preference profiles.

Theorems & Definitions (30)

  • Example 1
  • Theorem 1
  • proof
  • Corollary 1
  • Example 2: The stability of arm vs. agent proposing DA when estimation is wrong.
  • Definition 4.1
  • Lemma 1
  • proof
  • Theorem 2
  • proof
  • ...and 20 more