Estimating the history of a random recursive tree

Simon Briend; Christophe Giraud; Gábor Lugosi; Déborah Sulem

TL;DR

The paper addresses the problem of reconstructing the entire arrival history of a random recursive tree under two growth models: uniform attachment (the uniform random recursive tree, URRT) and linear preferential attachment (PA). It introduces an ordering estimator based on Jordan centrality and a family of risk measures $R_{\alpha}$ that quantify ordering quality, derives minimax lower bounds, and proves that the estimator is nearly optimal in both models. A descendant-centrality proxy is developed to facilitate the analysis, yielding explicit upper bounds and establishing minimax optimality, up to logarithmic factors, for $\alpha \in [1,2)$ in the URRT model and $\alpha \in [1,3/2)$ in the PA model. Simulations corroborate the theoretical results: the Jordan-based approach outperforms degree-based and spectral orderings in practice and scales efficiently. These results advance network archaeology by providing provable guarantees and practical, scalable methods for recovering the temporal order of growth in recursive trees.

Abstract

This paper studies the problem of estimating the order of arrival of the vertices in a random recursive tree. Specifically, we study two fundamental models: the uniform attachment model and the linear preferential attachment model. We propose an order estimator based on the Jordan centrality measure and define a family of risk measures to quantify the quality of the ordering procedure. Moreover, we establish a minimax lower bound for this problem, and prove that the proposed estimator is nearly optimal. Finally, we numerically demonstrate that the proposed estimator outperforms degree-based and spectral ordering procedures.
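To make the estimator described above concrete, here is a minimal Python sketch, not the authors' implementation. It assumes the usual definition of Jordan centrality, where $\psi(v)$ is the size of the largest subtree left after removing $v$, and it orders vertices by increasing $\psi$ with ties broken at random. The function names (`sample_tree`, `jordan_centrality`, `jordan_ordering`) and the convention used to start the preferential attachment tree (a single initial edge, attachment proportional to degree) are assumptions made for illustration.

```python
import random


def sample_tree(n, model="urrt", seed=None):
    """Sample a recursive tree on vertices 0..n-1, labelled by arrival time.

    Returns parent, where parent[i] is the vertex that i attached to
    (parent[0] == -1). model is "urrt" (uniform attachment) or "pa"
    (attachment proportional to degree; an assumed convention).
    """
    rng = random.Random(seed)
    parent = [-1] * n
    endpoints = []  # each vertex appears once per incident edge (used for PA)
    for i in range(1, n):
        if model == "urrt" or not endpoints:
            p = rng.randrange(i)  # uniform over existing vertices
        else:
            p = rng.choice(endpoints)  # proportional to current degree
        parent[i] = p
        endpoints.extend((p, i))
    return parent


def to_adjacency(parent):
    """Convert a parent list into an undirected adjacency list."""
    adj = [[] for _ in parent]
    for child, p in enumerate(parent):
        if p >= 0:
            adj[child].append(p)
            adj[p].append(child)
    return adj


def jordan_centrality(adj):
    """Return psi, where psi[v] is the largest component size of the tree minus v."""
    n = len(adj)
    root = 0  # any vertex works: psi does not depend on the rooting
    par = [-1] * n
    seen = [False] * n
    seen[root] = True
    order, stack = [], [root]
    while stack:  # iterative DFS, so large trees do not hit the recursion limit
        v = stack.pop()
        order.append(v)
        for u in adj[v]:
            if not seen[u]:
                seen[u] = True
                par[u] = v
                stack.append(u)
    size = [1] * n
    psi = [0] * n
    for v in reversed(order):  # descendants are processed before their ancestors
        psi[v] = max(psi[v], n - size[v])  # component containing v's parent
        if par[v] >= 0:
            size[par[v]] += size[v]
            psi[par[v]] = max(psi[par[v]], size[v])  # component hanging below
    return psi


def jordan_ordering(adj, seed=None):
    """Estimated arrival order: vertices sorted by increasing psi, random ties."""
    psi = jordan_centrality(adj)
    vertices = list(range(len(adj)))
    random.Random(seed).shuffle(vertices)  # randomise ties before the stable sort
    vertices.sort(key=lambda v: psi[v])
    return vertices


if __name__ == "__main__":
    tree = to_adjacency(sample_tree(1000, model="urrt", seed=0))
    print(jordan_ordering(tree, seed=0)[:10])  # vertices estimated to be earliest
```

Because the true arrival labels are used only to generate the tree, the ordering step sees nothing but the adjacency structure, which matches the setting in which only the final unlabelled tree is observed. The whole computation is a single traversal plus a sort.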

Paper Structure (21 sections, 12 theorems, 107 equations, 8 figures)

Introduction
Related work
Notation
The uniform attachment model
A lower bound
An auxiliary "descendant-ordering" procedure
Performance of Jordan ordering in the URRT model
Preferential attachment tree
A lower bound
Performance of the Jordan ordering in the PA model
Simulations
Appendix
A remark on the choice of the risk
...and 6 more sections

Key Result

Theorem 1

In the URRT model, we have, for all $\alpha > 0$ and $n \geq 200$,

Figures (8)

Figure 1: An illustration of the subtree $(T,u)_v$, corresponding to nodes highlighted in red.
Figure 2: Sketch of a tree and its centroid. Circled in red are the vertices of the path $\{1\to c\}$ (case $1$). Blue vertices correspond to case $2$, green to case $3$ and purple vertices to case $4$.
Figure 3: Risk $R_{\alpha}$ of the descendant ordering versus the tree size $n$ on logarithmic scales, for $\alpha=1$ (left panel) and $\alpha=1.5$ (right panel), for trees simulated from the URRT model. We sample $10$ trees for each size and report a boxplot with the median and the first and last quartiles for each tree size.
Figure 4: Risk $R_{\alpha}$ of the descendant ordering versus the tree size $n$ on logarithmic scales, for $\alpha=1$ (left panel) and $\alpha=1.2$ (right panel), for trees simulated from the PA model. We sample $10$ trees for each size and report a boxplot with the median and the first and last quartiles for each tree size.
Figure 5: Risk $R_{\alpha}$ versus the tree size $n$ on logarithmic scales, for $\alpha=1.5$, for trees simulated from the URRT model. We sample $10$ trees for each size. We compare the risk of the descendant (blue), degree (orange), and spectral (green) methods, and report a boxplot with the median and the first and last quartiles for each tree size. In all settings, the descendant ordering largely outperforms the other methods.
...and 3 more figures

Theorems & Definitions (21)

Theorem 1
proof
Lemma 2
proof
Lemma 3
proof
Theorem 4
Corollary 5
Lemma 6
Theorem 7
...and 11 more
