A simple proof of the formula of Solov'ev--Nielsen--Blom for the expected waiting time

Yuuya Yoshida

TL;DR

The paper addresses computing the expected waiting time $E(w)$ until a finite pattern $w$ appears in an i.i.d. sequence over an alphabet with distribution $p$. It presents a simple, purely combinatorial proof of the Solov'ev–Nielsen–Blom formula $E(w)=E(\hat{w})+p(w)^{-1}$, where $\hat{w}$ is the longest bifix of $w$, and builds the proof on precise hitting-time definitions ($T_w$), recursion via $E(w|w')$, and bifix structure. Additional identities are derived, including $\sum_{y\in\mathcal X} p(y)E(wy)=E(w)+1+(r-1)p(w)^{-1}$ and $\sum_{|w|=n} p(w)E(w)=r^n+n-1$, with F1 and F2 as corollaries. The approach clarifies the combinatorial underpinnings of waiting times for pattern occurrences and yields a transparent inductive argument that complements previous probabilistic proofs.

Abstract

Solov'ev (1966), Nielsen (1973), and Blom (1982) independently showed a formula for the expected waiting time until a given finite pattern first occurs in random data. In this paper, we give a simple and combinatorial proof of the formula.

Paper Structure

This paper contains 5 sections, 6 theorems, 29 equations.

Key Result

Theorem 1.1

Let $w$ be a finite word of length $\ge1$, and let $\hat{w}$ be the longest bifix of $w$. Then $E(w)=E(\hat{w})+p(w)^{-1}$, where $p(w)$, $|w|$, and $w(i)$ denote the product $\prod_{i=1}^{|w|} p(w(i))$, the length of $w$, and the $i$th letter of $w$, respectively.
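
The recursion in Theorem 1.1 can be unrolled numerically. The following is a minimal sketch, not code from the paper: it assumes that "bifix" means a proper prefix of $w$ that is also a suffix of $w$, and that the empty word has waiting time $0$. The function names (`longest_bifix`, `word_probability`, `expected_waiting_time`, `weighted_sum`) and the `fair_coin` example are illustrative choices. Exact rational arithmetic is used so that the results can be compared with the identity $\sum_{|w|=n} p(w)E(w)=r^n+n-1$ without rounding error.

```python
from fractions import Fraction
from itertools import product


def longest_bifix(w):
    """Longest proper prefix of w that is also a suffix of w (may be empty)."""
    for k in range(len(w) - 1, 0, -1):
        if w[:k] == w[-k:]:
            return w[:k]
    return w[:0]


def word_probability(w, p):
    """p(w): product of the letter probabilities p(w(i))."""
    prob = Fraction(1)
    for letter in w:
        prob *= p[letter]
    return prob


def expected_waiting_time(w, p):
    """Unroll E(w) = E(longest bifix of w) + 1/p(w), assuming E(empty) = 0."""
    total = Fraction(0)
    while len(w) > 0:
        total += 1 / word_probability(w, p)
        w = longest_bifix(w)
    return total


def weighted_sum(n, p):
    """Sum of p(w) * E(w) over all words w of length n over the alphabet of p."""
    return sum(
        word_probability(w, p) * expected_waiting_time(w, p)
        for w in product(p, repeat=n)
    )


# Assumed example distribution: a fair coin, so r = 2.
fair_coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
print(expected_waiting_time("HTH", fair_coin))  # 8 + 2 = 10
print(expected_waiting_time("HHH", fair_coin))  # 8 + 4 + 2 = 14
print(weighted_sum(2, fair_coin))               # r^n + n - 1 = 5
```

For `"HTH"` the longest bifix is `"H"`, so the loop adds $p(\mathrm{HTH})^{-1}=8$ and then $p(\mathrm{H})^{-1}=2$. For `"HHH"` the chain of bifixes is `"HH"`, `"H"`, which gives $8+4+2$.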

Theorems & Definitions (14)

  • Theorem 1.1: Solov'ev, Nielsen, and Blom
  • Lemma 2.1
  • Lemma 2.2
  • Lemma 2.3
  • proof : Proof of Lemma 2.3 assuming Lemmas 2.1 and 2.2
  • Lemma 3.1
  • proof
  • proof : Proof of Theorem 1.1
  • proof : Proof of F1
  • proof : Proof of F2
  • ...and 4 more