A simple proof of the formula of Solov'ev--Nielsen--Blom for the expected waiting time
Yuuya Yoshida
TL;DR
The paper addresses computing the expected waiting time $E(w)$ until a finite pattern $w$ first appears in an i.i.d. sequence over a finite alphabet $\mathcal X$ with distribution $p$. It presents a simple, purely combinatorial proof of the Solov'ev–Nielsen–Blom formula $E(w)=E(\hat{w})+p(w)^{-1}$, where $\hat{w}$ is the longest bifix of $w$, that is, the longest proper prefix of $w$ that is also a suffix of $w$. The proof rests on a precise definition of the hitting time $T_w$, a recursion via $E(w|w')$, and the bifix structure of $w$. Two further identities are derived, $\sum_{y\in\mathcal X} p(y)E(wy)=E(w)+1+(r-1)p(w)^{-1}$ and $\sum_{|w|=n} p(w)E(w)=r^n+n-1$, where $r=|\mathcal X|$ is the alphabet size, with F1 and F2 as corollaries. The approach makes the combinatorial structure behind waiting times for pattern occurrences explicit and yields a transparent inductive argument that complements earlier probabilistic proofs.
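As an illustration (a minimal sketch, not code from the paper), the formula $E(w)=E(\hat{w})+p(w)^{-1}$ can be unrolled into a loop that repeatedly replaces $w$ by its longest bifix until the empty word is reached. The sketch assumes $E$ of the empty word is $0$ and that $p$ is supplied as a dictionary of exact fractions; the function names `longest_bifix`, `prob`, and `expected_wait` are hypothetical helpers introduced only for this example.

```python
from fractions import Fraction
from itertools import product


def longest_bifix(w):
    """Longest proper prefix of w that is also a suffix of w (possibly empty)."""
    for k in range(len(w) - 1, 0, -1):
        if w[:k] == w[-k:]:
            return w[:k]
    return w[:0]  # empty word


def prob(w, p):
    """p(w): product of the letter probabilities of w (1 for the empty word)."""
    result = Fraction(1)
    for x in w:
        result *= p[x]
    return result


def expected_wait(w, p):
    """E(w) obtained by unrolling E(w) = E(w_hat) + 1/p(w), assuming E(empty) = 0."""
    total = Fraction(0)
    while w:
        total += 1 / prob(w, p)
        w = longest_bifix(w)
    return total


# Fair coin: the longest bifix of "HTH" is "H", so E(HTH) = E(H) + 8 = 2 + 8.
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
print(expected_wait("HTH", coin))  # 10
print(expected_wait("HH", coin))   # 6

# Numerical check of sum_{|w|=n} p(w) E(w) = r^n + n - 1 on a biased 3-letter alphabet.
p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
n, r = 2, len(p)
lhs = sum(prob(w, p) * expected_wait(w, p)
          for w in map("".join, product(p, repeat=n)))
print(lhs == r**n + n - 1)  # True
```

Exact `Fraction` arithmetic is used so that the identities can be checked with equality rather than a floating-point tolerance.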
Abstract
Solov'ev (1966), Nielsen (1973), and Blom (1982) independently established a formula for the expected waiting time until a given finite pattern first occurs in random data. In this paper, we give a simple combinatorial proof of the formula.
