Card guessing and the birthday problem for sampling without replacement

Jimmy He, Andrea Ottolini

Abstract

Consider a uniformly random deck consisting of cards labelled by numbers from $1$ through $n$, possibly with repeats. A guesser guesses the top card, after which it is revealed and removed and the game continues. What is the expected number of correct guesses under the best and worst strategies? We establish sharp asymptotics for both strategies. For the worst case, this answers a recent question of Diaconis, Graham, He and Spiro, who found the correct order. As part of the proof, we study the birthday problem for sampling without replacement using Stein's method.
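The game described above can be explored numerically. The following is a minimal simulation sketch, not the authors' code: the function names `play` and `simulated_mean` are hypothetical, and it assumes that the best strategy is the greedy one (always guess a type with the most cards remaining) and the worst strategy is its mirror image (always guess a type with the fewest cards remaining, including types that are already exhausted).

```python
import random


def play(multiplicities, best, rng):
    """Play one game with full feedback; return the number of correct guesses.

    multiplicities[i] is the number of copies of card type i in the deck.
    best=True uses the greedy maximizing guess, best=False the greedy
    minimizing guess (both are assumptions of this sketch).
    """
    # Build the deck and shuffle it uniformly at random.
    deck = [i for i, m in enumerate(multiplicities) for _ in range(m)]
    rng.shuffle(deck)

    remaining = list(multiplicities)  # copies of each type still in the deck
    pick = max if best else min
    correct = 0
    for card in deck:
        # Guess the type with the most (or fewest) remaining copies;
        # ties are broken by the lowest type index.
        guess = pick(range(len(remaining)), key=remaining.__getitem__)
        correct += guess == card
        remaining[card] -= 1  # the top card is revealed and removed
    return correct


def simulated_mean(multiplicities, best, trials=10_000, seed=0):
    """Monte Carlo estimate of the expected number of correct guesses."""
    rng = random.Random(seed)
    total = sum(play(multiplicities, best, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    m, n = 2, 20  # n card types, each with multiplicity m
    print("best :", simulated_mean([m] * n, best=True))
    print("worst:", simulated_mean([m] * n, best=False))
```

Under the greedy maximizing guess the last card is always guessed correctly, so the estimate for the best strategy is at least 1, whereas the estimate for the worst strategy should shrink as the number of card types grows.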

Paper Structure

This paper contains 18 sections, 17 theorems, 108 equations, 5 figures, 1 table.

Key Result

Theorem 1.1

Consider a deck with $n$ distinct card types, with multiplicities $\mathbf{m}=(m_1,\dotsc, m_n)$. Let $\epsilon\in (0,1]$ be the fraction of types that appear with multiplicity $m^*=\max m_i$. Then where $H_n=1+\dotsb+\frac{1}{n}$, and the implicit constant depends on $m^*$ and $\epsilon$.

Figures (5)

  • Figure 1: Comparison of the approximation of $\mathbb{E}[S_\mathbf{m}^+]$ given by Theorem \ref{thm: best case, weakversion} with the simulated mean (10,000 trials), for $\mathbf{m}=m\mathbf{1}_n$ with $m=2,\dotsc,6$, and $n=2,\dotsc, 100$. Dots represent the simulated means and solid lines represent $H_mH_n+\sum \ln\gamma_j$.
  • Figure 2: Comparison of the approximation of $\mathbb{E}[S_\mathbf{m}^+]$ given by Theorem \ref{thm: best case, weakversion} and the approximation $H_m \ln n$ from \cite{diaconis2020card} with the simulated mean (10,000 trials), for $\mathbf{m}=m\mathbf{1}_n$ with $m=4,5,6$, and $n=2,\dotsc, 100$. Dots represent the simulated means, solid lines represent $H_mH_n+\sum \ln\gamma_j$, and dashed lines represent $H_m \ln n$.
  • Figure 3: Comparison of the approximation of $\mathbb{E}[S_\mathbf{m}^+]$ given by Theorem \ref{thm: best case, weakversion} with the simulated mean (10,000 trials), for a deck with $n/2$ types of multiplicity $m_1$ and $n/2$ types of multiplicity $m_2$. Data is shown for $m_1,m_2=6, 8, 10, 12$, and $n=2,4,\dotsc, 100$. Dots represent the simulated means and solid lines represent $H_{m^*}H_n+\sum \ln \gamma_j$.
  • Figure 4: Comparison of the approximation of $\mathbb{E}[S_\mathbf{m}^-]$ given by Theorem \ref{thm: worst case} with the simulated mean (10,000 trials), for $\mathbf{m}=m\mathbf{1}_n$ with $m=2,\dotsc,6$, and $n=10, 20, \dotsc, 1000$. Dots represent the simulated means and solid lines represent $\sum_{j=\lfloor m/2\rfloor+1}^{m}\Gamma\left(\frac{j+1}{j}\right)\gamma_j^{-1}n^{-\frac{1}{j}}$.
  • Figure 5: Comparison of the approximation of $\mathbb{E}[S_\mathbf{m}^-]$ given by Theorem \ref{thm: worst case}, and of the approximation obtained by including the extra terms from $j=1$ to $\lfloor m/2\rfloor$, with the simulated mean (10,000 trials), for $\mathbf{m}=m\mathbf{1}_n$ with $m=2,\dotsc,6$, and $n=10, 20, \dotsc, 1000$. Dots represent the simulated means, solid lines represent $\sum_{j=\lfloor m/2\rfloor+1}^{m}\Gamma\left(\frac{j+1}{j}\right)\gamma_j^{-1}n^{-\frac{1}{j}}$, and dashed lines represent $\sum_{j=1}^{m}\Gamma\left(\frac{j+1}{j}\right)\gamma_j^{-1}n^{-\frac{1}{j}}$.

Theorems & Definitions (41)

  • Theorem 1.1
  • Theorem 1.2
  • Remark 1.3
  • Remark 1.4
  • Remark 1.5
  • Theorem 1.6
  • Remark 1.7
  • Theorem 1.8
  • Remark 1.9
  • Theorem 3.1: barbour1992poisson
  • ...and 31 more