Statistics of leaves in growing random trees

Harrison Hartle, P. L. Krapivsky

TL;DR

This work introduces leaf degree as a natural statistic for sparse growing trees and develops a coherent leaf-based framework for two growth paradigms: random recursive trees (RRTs) and leaf-based preferential attachment (leaf-PA). Using generating functions and age-stratified analyses, the authors derive exact and asymptotic forms for the leaf-degree distribution in RRTs, including a factorially decaying $m_\ell$ and a Poisson primordial-leaf-degree law, together with precise results for the distribution and cumulants of the total leaf count $\mathcal{L}_N$. In leaf-PA models, a rich set of tail behaviours emerges: a power-law tail for $0<a<1$, a stretched exponential at the critical $a=1$, and exponential tails with algebraic prefactors for $a>1$, with detailed expressions for the primordial and age-stratified leaf statistics. A conjectured equivalence between leaf-degree and degree-tail exponents in the scale-free regime is proposed and partially validated against degree-based PA with matched parameters. The results provide tractable, analytics-friendly tools for leaf-based statistics in sparse networks and suggest avenues for extending leaf-centric analyses to real data and broader graph models.

Abstract

Leaves, i.e., vertices of degree one, can play a significant role in graph structure, especially in sparsely connected settings in which leaves often constitute the largest fraction of vertices. We consider a leaf-based counterpart of the degree, namely, the leaf degree -- the number of leaves a vertex is connected to -- and the associated leaf degree distribution, analogous to the degree distribution. We determine the leaf degree distribution of random recursive trees (RRTs) and of trees grown via a leaf-based preferential attachment mechanism that we introduce. The RRT leaf degree distribution decays factorially, in contrast with its purely geometric degree distribution. In the one-parameter leaf-based growth model, each new vertex attaches to an existing vertex with rate $\ell + a$, where $\ell$ is the leaf degree of the existing vertex and $a > 0$. The leaf degree distribution has a power-law tail when $0 < a < 1$ and an exponential tail (with algebraic prefactor) for $a > 1$. The critical case $a = 1$ has a leaf degree distribution with a stretched exponential tail. We compute a variety of additional characteristics in these models and conjecture asymptotic equivalence of the degree and leaf degree power-law tail exponents in the scale-free regime. We highlight several avenues of possible extension for future studies.
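
The two growth rules described in the abstract can be illustrated with a short simulation. The following is a minimal Python sketch, not the authors' code: the function names `grow_tree` and `leaf_degree_from_scratch`, the single-vertex starting tree, and the quadratic-time weighted sampling are all assumptions made for illustration. Passing `a=None` gives uniform attachment (an RRT), while a positive `a` attaches to vertex $v$ with weight $\ell_v + a$.

```python
import random
from collections import Counter


def grow_tree(n, a=None, seed=0):
    """Grow a tree on n vertices and return (neighbors, leaf_deg).

    a=None -> uniform attachment (random recursive tree).
    a>0    -> attach to vertex v with weight leaf_deg[v] + a (leaf-PA).
    Starting from a single vertex is an assumption of this sketch.
    """
    rng = random.Random(seed)
    neighbors = [[]]   # adjacency lists; vertex 0 is the assumed seed vertex
    leaf_deg = [0]     # leaf degree of each vertex, updated incrementally
    for new in range(1, n):
        if a is None:
            target = rng.randrange(new)
        else:
            # O(n) weighted draw per step: simple, but slow for large n
            weights = [ell + a for ell in leaf_deg]
            target = rng.choices(range(new), weights=weights)[0]
        old_deg = len(neighbors[target])
        if old_deg == 1:
            # target stops being a leaf, so its sole neighbor loses one leaf
            leaf_deg[neighbors[target][0]] -= 1
        leaf_deg[target] += 1          # the new vertex is a leaf next to target
        neighbors[target].append(new)
        neighbors.append([target])
        # the new vertex's neighbor is itself a leaf only in the two-vertex tree
        leaf_deg.append(1 if old_deg == 0 else 0)
    return neighbors, leaf_deg


def leaf_degree_from_scratch(neighbors):
    """Recompute leaf degrees directly from the adjacency lists."""
    return [sum(1 for w in nb if len(neighbors[w]) == 1) for nb in neighbors]


if __name__ == "__main__":
    for a in (None, 0.5, 2.0):
        nb, ld = grow_tree(2000, a=a, seed=1)
        counts = Counter(ld)
        label = "RRT" if a is None else f"leaf-PA a={a}"
        print(label, [counts[ell] / len(ld) for ell in range(5)])
```

The incremental update keeps each step cheap for the RRT case; the from-scratch helper is there only as an independent cross-check. Resolving the tails discussed in the abstract would need much larger trees and a faster sampler than this sketch provides.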

Paper Structure

This paper contains 34 sections, 163 equations, 15 figures.

Figures (15)

  • Figure 1: A random recursive tree with $N=30$ vertices. The leaf degrees in this tree are $\ell=0,1,2,3$. The number of leaves is $17$ (blue) and the number of protected vertices is $2$ (purple), together constituting $M_0=19$. The numbers of vertices of leaf degree $\ell=1,2,3$ are $M_1=6$ (green), $M_2=4$ (red), and $M_3=1$ (cyan). The ordinary degree values appearing are $k=1,2,3,4,5,6$; the numbers of vertices with these degrees are $(N_1,N_2,N_3,N_4,N_5,N_6)=(17,5,4,2,1,1)$.
  • Figure 2: Normality of the number of protected vertices $P=M_0-N_1$ and of the numbers of vertices $M_\ell$ with leaf degree $\ell$ for $\ell=0,1,2$. Data from $1.2\times 10^5$ stochastic realizations of the RRT at size $N=10^5$. The intensive variance parameters of Eq. \ref{ML-Gauss} are fitted as $\nu_0\approx 0.3816$, $\nu_1\approx 0.1396$, and $\nu_2\approx 0.0570$. The green dots are empirical histogram values, blue dots are a binned histogram, and the black dashed lines are fitted normals.
  • Figure 3: Evolution of leaf degree in growing trees. A vertex of leaf degree $\ell$ (turquoise central vertex, lower panel) may either be directly attached to by the arriving vertex (black), or, if $\ell>0$, one of its leaf neighbors (green) may be attached to. Under direct attachment, its leaf degree increases to $\ell+1$ (upper left panel). If one of its leaves is attached to, that previous leaf becomes a nonleaf (blue), yielding a decrease in leaf degree to $\ell-1$ (upper right panel). In the depicted case, $\ell=3$ becomes $\ell=4$ under direct attachment or $\ell=2$ under a leaf neighbor attachment. The total leaf-count goes from $3$ to $4$ in the former case, and is preserved at $3$ in the latter.
  • Figure 4: Degree and leaf degree distribution in RRTs. Data from $1.2\times 10^5$ stochastic realizations at size $N=10^5$. The degree distribution is $n_k=2^{-k}$ and the leaf degree distribution is given by Eq. \ref{m-sol}. Dashed lines represent theoretical curves and dots represent simulation data.
  • Figure 5: Age-stratified distribution of leaf degree at several intermediate normalized ages $x=j/N$. The solid curve shows the theoretical value (Eq. \ref{Pi:RRT}), and the dots are simulation data from $10^4$ RRTs of size $N=10^4$, with the approximation at $x$ taken from indices $Nx-\Delta<j<Nx+\Delta$ with $\Delta=50$.
  • ...and 10 more figures
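
The counts quoted in the Figure 1 caption can be cross-checked against identities that hold for any tree. The Python sketch below is illustrative only: the variable names are assumptions, and the numbers are transcribed from the caption. It checks that both sets of counts sum to $N$, that the degrees sum to $2(N-1)$, that the leaf degrees sum to the number of leaves (each leaf has exactly one neighbor), and that $M_0$ equals leaves plus protected vertices, as implied by $P=M_0-N_1$ in the Figure 2 caption.

```python
# Counts transcribed from the Figure 1 caption (N = 30).
N = 30
degree_counts = {1: 17, 2: 5, 3: 4, 4: 2, 5: 1, 6: 1}   # N_k for k = 1..6
leaf_degree_counts = {0: 19, 1: 6, 2: 4, 3: 1}          # M_ell for ell = 0..3
num_leaves = 17
num_protected = 2


def total(counts):
    """Number of vertices covered by a histogram of counts."""
    return sum(counts.values())


def weighted_sum(counts):
    """Sum of value * multiplicity, e.g. the total degree."""
    return sum(value * count for value, count in counts.items())


checks = {
    "degree counts sum to N": total(degree_counts) == N,
    "degrees sum to 2(N-1)": weighted_sum(degree_counts) == 2 * (N - 1),
    "leaf-degree counts sum to N": total(leaf_degree_counts) == N,
    "leaf degrees sum to the number of leaves":
        weighted_sum(leaf_degree_counts) == num_leaves,
    "M_0 = leaves + protected":
        leaf_degree_counts[0] == num_leaves + num_protected,
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

All five checks hold for the caption's numbers, which also confirms that the degree tuple has six entries, one for each of $k=1,\dots,6$.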