The Careless Coupon Collector's Problem

Emilio Cruciani; Aditi Dudeja

TL;DR

The paper analyzes the number of rounds required to complete the collection as a function of $n$ and $p$, and gives an algorithm that computes the expected completion time of CCCP in $O(n^2)$ time.

Abstract

We initiate the study of the Careless Coupon Collector's Problem (CCCP), a novel variation of the classical coupon collector problem that we envision as a model for information systems such as web crawlers, dynamic caches, and fault-resilient networks. In CCCP, a collector attempts to gather $n$ distinct coupon types by obtaining one coupon type uniformly at random in each discrete round. However, the collector is careless: at the end of each round, each collected coupon type is independently lost with probability $p$. We analyze the number of rounds required to complete the collection as a function of $n$ and $p$. In particular, we show that it transitions, in multiple distinct phases, from $\Theta(n \ln n)$ when $p = o\big(\frac{\ln n}{n^2}\big)$ up to $\Theta\big((\frac{np}{1-p})^n\big)$ when $p=\omega\big(\frac{1}{n}\big)$. Interestingly, when $p=\frac{c}{n}$, the process remains in a metastable phase for a time window of length $e^{\Theta(n)}$, during which the fraction of collected coupon types is concentrated around $\frac{1}{1+c}$ with probability $1-o(1)$. Finally, we give an algorithm that computes the expected completion time of CCCP in $O(n^2)$ time.
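A short heuristic calculation suggests where the metastable fraction $\frac{1}{1+c}$ comes from. This is only a back-of-the-envelope drift balance under the round structure described in the abstract (one draw, then independent losses); it is not the paper's proof, and the symbols $k$ and $x$ are introduced here purely for illustration.

```latex
% Assumption: k = |S_t| coupon types are held at the start of a round.
% The draw adds a new type with probability (n-k)/n; each held type then
% survives the loss step with probability 1-p.
\mathbb{E}\big[\,|S_{t+1}| - |S_t| \,\big|\, |S_t| = k\,\big]
  = (1-p)\Big(k + \frac{n-k}{n}\Big) - k
  = (1-p)\Big(1 - \frac{k}{n}\Big) - pk .

% Setting the drift to zero and writing x = k/n, p = c/n:
(1-p)(1-x) = pnx
\;\Longrightarrow\;
x = \frac{1-p}{1-p+np} = \frac{1 - c/n}{1 + c - c/n}
\;\xrightarrow[n \to \infty]{}\; \frac{1}{1+c}.
```

The balance point is where the expected gain from drawing a missing type equals the expected number of types lost per round.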

Paper Structure (11 sections, 17 theorems, 73 equations, 3 figures)

This paper contains 11 sections, 17 theorems, 73 equations, 3 figures.

Introduction
Related work
Applications
Preliminaries
Some observations on CCCP
Marginal probabilities
Metastability
Hitting time
Exact computation
Mean-field analysis
Asymptotic bounds

Key Result

Proposition 2

Let $(S_t^{(p_1)})_{t \ge 0}$ and $(S_t^{(p_2)})_{t \ge 0}$ be CCCP chains with loss probabilities $p_1$ and $p_2$. If $p_1 \le p_2$ then $T_{n,p_1} \preceq T_{n,p_2}$.
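Reading $T_{n,p}$ as the completion time with loss probability $p$ and $\preceq$ as stochastic domination (both presumably defined in the paper's preliminaries), the statement can be illustrated with a shared-randomness coupling. The sketch below is one natural coupling, not necessarily the authors' proof. The function name, the `seed` and `max_rounds` parameters, and the choice to check completion after the loss step are all assumptions made for this example.

```python
import random

def coupled_completion_times(n, p1, p2, seed=0, max_rounds=1_000_000):
    """Run two CCCP chains (loss probabilities p1 <= p2) on shared randomness.

    Assumed round structure: draw one coupon type uniformly at random, then
    lose each held type independently; completion is checked after the loss.
    Returns (t1, t2), the first rounds at which each chain holds all n types
    (None if not reached within max_rounds).
    """
    assert 0 <= p1 <= p2 < 1
    rng = random.Random(seed)
    s1, s2 = set(), set()
    t1 = t2 = None
    for t in range(1, max_rounds + 1):
        c = rng.randrange(n)                    # same drawn coupon for both chains
        u = [rng.random() for _ in range(n)]    # same uniform per coupon type
        # Coupon i is lost in chain j exactly when u[i] < p_j.
        s1 = {i for i in s1 | {c} if u[i] >= p1}
        s2 = {i for i in s2 | {c} if u[i] >= p2}
        # Since p1 <= p2, anything chain 2 keeps is also kept by chain 1.
        assert s2 <= s1
        if t1 is None and len(s1) == n:
            t1 = t
        if len(s2) == n:
            t2 = t
            break
    return t1, t2
```

Because the less careless chain always holds a superset of the other chain's collection under this coupling, it can never finish later, which is the pathwise version of $T_{n,p_1} \preceq T_{n,p_2}$.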

Figures (3)

Figure 1: An illustration of a careless coupon collector.
Figure 2: Empirical hitting time and metastable fractions of coupons of CCCP, averaged over 1000 repetitions. On the left: average hitting time while varying $p$, for fixed $n=10$; the small value of $n$ makes it feasible to wait out the exponentially long time the process needs to end. On the right: average fraction $|S_t| / n$ of coupons in the collection at time $t$, for several values of $p$.
Figure 3: A graphical representation of the reduced Markov Chain of CCCP with $n=3$ coupons (left) and its corresponding lower-Hessenberg transition matrix with first superdiagonal entries in magenta (right).
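The reduced chain in Figure 3 tracks only the number of held coupon types, which can rise by at most one per round; that is why the matrix is lower-Hessenberg. The sketch below builds one row of such a matrix and uses the Hessenberg shape to get the expected completion time by forward substitution. It is an illustration under stated assumptions, not the paper's algorithm: the function names are hypothetical, completion is assumed to be checked after the loss step, and plain floating point is used, so it is only meant for small $n$ and $0 \le p < 1$.

```python
from math import comb

def transition_row(n, p, k):
    """Row k of the assumed reduced chain on |S_t| (draw, then losses)."""
    def survive(m, j):
        # Probability that exactly j of m held coupon types survive the loss step.
        return comb(m, j) * (1 - p) ** j * p ** (m - j) if 0 <= j <= m else 0.0
    row = [0.0] * (n + 1)
    for j in range(min(k + 1, n) + 1):
        # With prob. k/n the draw is a duplicate (k held), else it is new (k+1 held).
        row[j] = (k / n) * survive(k, j) + ((n - k) / n) * survive(k + 1, j)
    return row

def expected_completion_time(n, p):
    """Expected rounds to reach state n from the empty collection.

    Writes h[k] = a[k] + b[k] * h[0], where h[k] is the expected remaining
    time from state k. Row k of h = 1 + P h involves h[0..k+1] only, so it
    can be solved for h[k+1]; the boundary condition h[n] = 0 then fixes h[0].
    """
    a, b = [0.0], [1.0]
    for k in range(n):
        row = transition_row(n, p, k)
        sa = sum(row[j] * a[j] for j in range(k + 1))
        sb = sum(row[j] * b[j] for j in range(k + 1))
        up = row[k + 1]                 # superdiagonal entry, positive for p < 1
        a.append((a[k] - 1 - sa) / up)
        b.append((b[k] - sb) / up)
    return -a[n] / b[n]
```

For $p = 0$ this reduces to the classical coupon collector, so `expected_completion_time(3, 0.0)` should agree with $3(1 + \tfrac12 + \tfrac13) = 5.5$ up to rounding. For larger $n$ or $p$ the coefficients grow quickly, and exact rational arithmetic may be a safer choice.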

Theorems & Definitions (34)

Proposition 2
proof
Lemma 5
proof
Lemma 6
proof
Corollary 7
proof
Lemma 8
proof
...and 24 more
