The Careless Coupon Collector's Problem

Emilio Cruciani, Aditi Dudeja

TL;DR

The number of rounds required to complete the collection is analyzed as a function of $n$ and $p$, and an algorithm is given that computes the expected completion time of CCCP in $O(n^2)$ time.

Abstract

We initiate the study of the Careless Coupon Collector's Problem (CCCP), a novel variation of the classical coupon collector's problem that we envision as a model for information systems such as web crawlers, dynamic caches, and fault-resilient networks. In CCCP, a collector attempts to gather $n$ distinct coupon types by obtaining one coupon type uniformly at random in each discrete round. The collector, however, is careless: at the end of each round, each collected coupon type is independently lost with probability $p$. We analyze the number of rounds required to complete the collection as a function of $n$ and $p$. In particular, we show that it transitions, in multiple distinct phases, from $\Theta(n \ln n)$ when $p = o\big(\frac{\ln n}{n^2}\big)$ up to $\Theta\big((\frac{np}{1-p})^n\big)$ when $p=\omega\big(\frac{1}{n}\big)$. Interestingly, when $p=\frac{c}{n}$, the process remains in a metastable phase, where the fraction of collected coupon types is concentrated around $\frac{1}{1+c}$ with probability $1-o(1)$, for a time window of length $e^{\Theta(n)}$. Finally, we give an algorithm that computes the expected completion time of CCCP in $O(n^2)$ time.
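
The process described in the abstract can be simulated directly. The following is a minimal sketch, not the authors' code. It assumes that each round consists of one uniform draw followed by the independent losses, and that the collection counts as complete when all $n$ types are held at the end of a round, after the losses; the abstract does not pin down this convention, and checking right after the draw would be an equally plausible reading. The names `simulate_cccp` and `max_rounds` are hypothetical.

```python
import random


def simulate_cccp(n, p, rng, max_rounds=10**7):
    """Simulate one run of the careless collector.

    Returns the number of rounds until all n types are held at the end
    of a round, or None if that does not happen within max_rounds.
    Assumption: draw first, then losses, then the completion check.
    """
    held = set()
    for t in range(1, max_rounds + 1):
        # Draw one coupon type uniformly at random (may be a duplicate).
        held.add(rng.randrange(n))
        # Each held type, including the new one, is lost independently
        # with probability p.
        held = {c for c in held if rng.random() >= p}
        if len(held) == n:
            return t
    return None


if __name__ == "__main__":
    rng = random.Random(2024)
    runs = [simulate_cccp(10, 0.05, rng) for _ in range(200)]
    finished = [r for r in runs if r is not None]
    print(sum(finished) / len(finished))
```

With `p = 0` the sketch reduces to the classical coupon collector. The `max_rounds` cap matters in practice because, by the abstract's bounds, completion can take exponentially long once $p$ is of order $1/n$ or larger.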

Paper Structure (11 sections, 17 theorems, 73 equations, 3 figures)

Key Result

Proposition 2

Let $(S_t^{(p_1)})_{t \ge 0}$ and $(S_t^{(p_2)})_{t \ge 0}$ be CCCP chains with loss probabilities $p_1$ and $p_2$, respectively. If $p_1 \le p_2$, then $T_{n,p_1} \preceq T_{n,p_2}$.
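
The excerpt does not define its notation, so the following unpacking rests on two assumptions: that $T_{n,p}$ is the completion time with $n$ coupon types and loss probability $p$, and that $\preceq$ is the usual stochastic order. Under that reading, the proposition says that for $p_1 \le p_2$,

$$
\Pr\big[T_{n,p_1} > t\big] \;\le\; \Pr\big[T_{n,p_2} > t\big] \quad \text{for all } t \ge 0,
$$

and, summing over $t$, the expected completion time is monotone in the loss probability:

$$
\mathbb{E}\big[T_{n,p_1}\big] \;=\; \sum_{t \ge 0} \Pr\big[T_{n,p_1} > t\big] \;\le\; \sum_{t \ge 0} \Pr\big[T_{n,p_2} > t\big] \;=\; \mathbb{E}\big[T_{n,p_2}\big].
$$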

Figures (3)

  • Figure 1: An illustration of a careless coupon collector.
  • Figure 2: Empirical hitting time and metastable fractions of coupons of CCCP, averaged over 1000 repetitions. On the left: average hitting time while varying $p$, for fixed $n=10$; the small value of $n$ makes it feasible to wait the exponentially long time the process may need to end. On the right: average fraction $|S_t| / n$ of coupons in the collection at time $t$, for several values of $p$.
  • Figure 3: A graphical representation of the reduced Markov chain of CCCP with $n=3$ coupons (left) and its corresponding lower-Hessenberg transition matrix with first superdiagonal entries in magenta (right).
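
The lower-Hessenberg structure mentioned in the caption of Figure 3 suggests a simple way to compute the expected completion time by forward substitution. The sketch below is an illustration under stated assumptions and is not claimed to be the paper's $O(n^2)$ algorithm. It assumes the reduced chain tracks the number of held types $|S_t|$ at the end of each round, that a round is one draw followed by the losses, that state $n$ is absorbing, and that $p < 1$. The function names are hypothetical. It writes each expected time $E_k$ as $a_k + b_k E_0$ and uses $E_n = 0$ to solve for $E_0$.

```python
from fractions import Fraction
from math import comb


def binom_pmf(m, j, q):
    """P[Binomial(m, q) = j]; exact when q is a Fraction."""
    if j < 0 or j > m:
        return Fraction(0)
    return comb(m, j) * q**j * (1 - q) ** (m - j)


def transition(n, p, k, j):
    """P(k -> j) for the number of held types (assumed: draw, then losses)."""
    q = 1 - p                    # survival probability of a held type
    new = Fraction(n - k, n)     # the drawn type is not yet held
    return new * binom_pmf(k + 1, j, q) + (1 - new) * binom_pmf(k, j, q)


def expected_completion_time(n, p):
    """Expected rounds to reach state n from state 0; requires p < 1."""
    p = Fraction(p)
    # Invariant: E_k = a[k] + b[k] * E_0 for every k computed so far.
    a, b = [Fraction(0)], [Fraction(1)]
    for k in range(n):
        # Row k of the chain: E_k = 1 + sum_{j <= k+1} P(k, j) * E_j.
        # Only E_{k+1} is unknown, because the matrix is lower Hessenberg.
        row = [transition(n, p, k, j) for j in range(k + 1)]
        up = transition(n, p, k, k + 1)
        sa = sum(w * a[j] for j, w in enumerate(row))
        sb = sum(w * b[j] for j, w in enumerate(row))
        a.append((a[k] - 1 - sa) / up)
        b.append((b[k] - sb) / up)
    # State n is absorbing, so E_n = 0 = a[n] + b[n] * E_0.
    return -a[n] / b[n]


if __name__ == "__main__":
    print(float(expected_completion_time(10, Fraction(1, 20))))
```

The sketch uses $O(n^2)$ transition evaluations, but exact rational arithmetic makes each operation expensive, so it is only practical for small $n$. Exact fractions are used because this kind of forward substitution can be numerically fragile in floating point.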

Theorems & Definitions (34)

  • Proposition 2
  • proof
  • Lemma 5
  • proof
  • Lemma 6
  • proof
  • Corollary 7
  • proof
  • Lemma 8
  • proof
  • ...and 24 more