Dixie cup problem in an interlacing process

Aristides V. Doumas

TL;DR

This paper studies a two-component Dixie cup (coupon collector) problem in which the coupon weights are formed by interlacing two sequences, one of which produces far rarer coupons. Using Poissonization and Euler–Maclaurin methods, it derives the leading-term asymptotics of $E\left[T_{m}(N;\alpha)\right]$ as $N\to\infty$ under broad growth conditions on the two sequences, and shows that both sequences contribute to the leading term through the integral expression $E\left[T_{m}(N;\alpha)\right]\sim (D_M+B_M)\int_{0}^{\infty}\left[1-\prod_{j=1}^{M}\left(1-S_m(d_ju)e^{-d_ju}\right)\right]du$, with $D_M=\sum_{j=1}^{M}d_j$ and $B_M=\sum_{j=1}^{M}b_j$. The analysis extends to more than two subfamilies and discusses rising moments, with concrete examples (e.g., Zipf versus exponential), a table of leading terms, and supporting simulations. The results show how heterogeneity in coupon probabilities shapes the expected time to collect $m$ complete sets, and they generalize classic coupon-collector results to interlaced distributions. The approach may be applicable in areas where heterogeneous rare events govern discovery processes.

Abstract

The double Dixie cup problem of D.J. Newman and L. Shepp is a well-known variant of the coupon collector problem, where the object of study is the number of coupons that a collector has to buy in order to complete m sets of all N existing different coupons. In this paper we consider the case where the coupon distribution is a mixture of two different distributions, where the coupons from the first distribution are far rarer than the ones coming from the second. We apply a Poissonization technique, together with well-known results and techniques from our previous work, to derive the asymptotics (leading term) of the expectation of the above random variable as N goes to infinity for large classes of distributions. As it turns out, both distributions contribute to this result. The leading asymptotics of the rising moments of the aforementioned random variable are also discussed. We conclude by generalizing the problem to the case where the family of coupons is a mixture of j subfamilies.
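The collecting process described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's simulation code: the function name `draws_until_m_sets` is hypothetical, and the only assumption made is that coupon type $j$ is drawn with probability proportional to its weight, independently at each purchase.

```python
import itertools
import random

def draws_until_m_sets(weights, m, rng):
    """Count purchases until every coupon type has been drawn at least m times.

    weights: positive numbers; type j is drawn with probability weights[j] / sum(weights).
    """
    n = len(weights)
    types = list(range(n))
    cum = list(itertools.accumulate(weights))  # cumulative weights, computed once
    counts = [0] * n
    remaining = n                              # types still short of m copies
    draws = 0
    while remaining > 0:
        j = rng.choices(types, cum_weights=cum)[0]
        draws += 1
        counts[j] += 1
        if counts[j] == m:                     # type j has just completed its m copies
            remaining -= 1
    return draws

if __name__ == "__main__":
    rng = random.Random(12345)
    # Illustrative weights only: two common types and two far rarer ones.
    weights = [1.0, 0.5, 0.01, 0.005]
    runs = [draws_until_m_sets(weights, 2, rng) for _ in range(200)]
    print(sum(runs) / len(runs))
```

Averaging many runs gives an empirical estimate of the expectation studied in the paper; the rare types dominate the waiting time, while the common types mainly dilute the draws.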

Paper Structure

This paper contains 5 sections, 2 theorems, 56 equations, 1 figure, 1 table.

Key Result

Theorem 1

Let the sequence $\alpha = \{a_{j}\}_{j=1}^{\infty }$ be formed as the interlacing of two subsequences $\beta = \{b_{j}\}_{j=1}^{\infty }$ and $\delta = \{d_{j}\}_{j=1}^{\infty }$, as given in relation (sb1). Then, as $N=2M \rightarrow \infty$, we have $E\left[T_{m}(N;\alpha)\right]\sim (D_M+B_M)\int_{0}^{\infty}\left[1-\prod_{j=1}^{M}\left(1-S_m(d_ju)e^{-d_ju}\right)\right]du$, where $D_M=\sum_{j=1}^{M}d_j$ and $B_M=\sum_{j=1}^{M}b_j$, provided that the coupons from the sequence $\delta$ are far rarer than the ones from the sequence $\beta$ (see (deca)--(cond3)).
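The leading-term formula quoted in the TL;DR can be compared numerically with the exact Poissonization integral for finite $M$. The sketch below rests on two assumptions not stated in this summary: that $S_m(y)=\sum_{k=0}^{m-1}y^k/k!$ (the usual partial exponential sum in Dixie cup formulas), and that coupon $j$ is drawn with probability $a_j/(D_M+B_M)$, which gives the exact identity $E[T_m]=(D_M+B_M)\int_0^\infty\left[1-\prod_{j=1}^{N}\left(1-S_m(a_ju)e^{-a_ju}\right)\right]du$. The function names, the truncation point, and the use of Simpson's rule are illustrative choices, not the paper's method.

```python
import math

def S(m, y):
    """Partial exponential sum S_m(y) = sum_{k=0}^{m-1} y^k / k! (assumed definition)."""
    term, total = 1.0, 1.0
    for k in range(1, m):
        term *= y / k
        total += term
    return total

def _integral(weights, m, steps=20000, cutoff=60.0):
    """Simpson's rule for int_0^U [1 - prod_j (1 - S_m(w_j u) e^{-w_j u})] du.

    U = cutoff / min(weights); for small m the integrand is negligible beyond U.
    steps must be even.
    """
    upper = cutoff / min(weights)
    h = upper / steps

    def f(u):
        prod = 1.0
        for w in weights:
            prod *= 1.0 - S(m, w * u) * math.exp(-w * u)
        return 1.0 - prod

    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

def exact_expectation(d, b, m):
    """Exact E[T_m]: the product runs over all weights, rare (d) and common (b)."""
    return (sum(d) + sum(b)) * _integral(list(d) + list(b), m)

def leading_term(d, b, m):
    """Leading-term expression: the product runs over the rare weights d only."""
    return (sum(d) + sum(b)) * _integral(list(d), m)

if __name__ == "__main__":
    d = [0.01, 0.005]   # illustrative rare weights
    b = [1.0, 0.5]      # illustrative common weights
    print(exact_expectation(d, b, 2), leading_term(d, b, 2))
```

Because every factor in the product lies in $[0,1]$, dropping the $\beta$ factors can only lower the integrand, so the leading-term expression never exceeds the exact value. The two get closer as the $\delta$ coupons become rarer relative to the $\beta$ coupons, while the prefactor $D_M+B_M$ keeps the contribution of both sequences.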

Theorems & Definitions (2)

  • Theorem 1
  • Proposition 2