Upper Bounds on the Average Height of Random Binary Trees
Louisa Seelbach Benkner
TL;DR
This paper addresses the problem of bounding the average height $\mathbb{E}(H_{n,\sigma})$ of random binary trees generated by leaf-centric sources. It introduces two broad source classes, $\psi$-upper-bounded and $\phi$-weakly-balanced, and develops a recursive exponential-moment framework (analyzing $\mathbb{E}(\varphi_n^{H_{n,\sigma}})$ and applying Jensen's inequality) to obtain tight $O(\log n)$ bounds in key models. The results generalize Devroye's $O(\log n)$ bound for binary search trees and yield an $O(\log n)$ bound for the binomial random tree model, but only a weaker $O(\sqrt{n}\log^2 n)$ bound for the uniform distribution. The paper also discusses the limitations of the approach in the uniform case and lists open questions: identifying a strongly-balanced leaf-centric class that contains the uniform model, deriving lower bounds, and extending the results to fixed-size ordinal tree sources.
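The Jensen step mentioned above can be sketched as follows. This is only the generic inequality, under the assumption that $\varphi_n > 1$ so that $x \mapsto \varphi_n^{x}$ is convex; how the paper actually chooses $\varphi_n$ and bounds the exponential moment is not reproduced here.

$$
\varphi_n^{\mathbb{E}(H_{n,\sigma})} \leq \mathbb{E}\left(\varphi_n^{H_{n,\sigma}}\right)
\quad\Longrightarrow\quad
\mathbb{E}(H_{n,\sigma}) \leq \frac{\log \mathbb{E}\left(\varphi_n^{H_{n,\sigma}}\right)}{\log \varphi_n}.
$$

In particular, if $\varphi_n$ is a constant greater than 1 and the exponential moment is bounded by a polynomial in $n$, the right-hand side is in $O(\log n)$.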
Abstract
We study the average height of random trees generated by leaf-centric binary tree sources as introduced by Zhang, Yang and Kieffer. A leaf-centric binary tree source induces for every $n \geq 2$ a probability distribution on the set of binary trees with $n$ leaves. Our results generalize a result by Devroye, according to which the average height of a random binary search tree of size $n$ is in $\mathcal{O}(\log n)$.
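To make the setting concrete, the following is a minimal simulation sketch rather than anything taken from the paper. It assumes the standard correspondence for the binary search tree model: a tree with $n$ leaves is split at the root into a left subtree with $i$ leaves and a right subtree with $n - i$ leaves, where $i$ is uniform on $\{1, \dots, n-1\}$. The names `sample_height` and `average_height` are hypothetical helpers introduced for illustration.

```python
import random


def sample_height(n, rng):
    """Sample the height of a random binary tree with n leaves.

    Assumption: the binary search tree model, where the number of leaves
    in the left subtree is uniform on {1, ..., n - 1} at every node.
    """
    height = 0
    # Each stack entry is (number of leaves in the subtree, depth of its root).
    stack = [(n, 0)]
    while stack:
        leaves, depth = stack.pop()
        if leaves == 1:
            # A single leaf: its depth is a candidate for the tree height.
            height = max(height, depth)
            continue
        left = rng.randint(1, leaves - 1)  # inclusive on both ends
        stack.append((left, depth + 1))
        stack.append((leaves - left, depth + 1))
    return height


def average_height(n, trials=200, seed=0):
    """Monte Carlo estimate of the average height for trees with n leaves."""
    rng = random.Random(seed)
    return sum(sample_height(n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for n in (16, 256, 4096):
        print(n, average_height(n))
```

Other leaf-centric sources could be simulated in the same way by replacing the uniform choice of `left` with a different split distribution. Such an experiment only illustrates growth rates and does not substitute for the bounds proved in the paper.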
