Improved Bounds with a Simple Algorithm for Edge Estimation for Graphs of Unknown Size
Debarshi Chanda
TL;DR
This work studies the sublinear-time problem of estimating a graph's average degree $d$ when neither the graph size $n$ nor the arboricity $\alpha$ is known. It presents a simple, parameter-oblivious algorithm that uses only Degree and RandEdge queries to obtain a $(1\pm\varepsilon)$-multiplicative estimate of $d$, making $\widetilde{O}\left(\frac{\alpha}{\varepsilon^{2}d}\right)$ Degree queries and $\widetilde{O}\left(\frac{1}{\varepsilon^{2}}\right)$ RandEdge queries in expectation. This improves on the $\widetilde{O}_{\varepsilon,\log n}\left(\sqrt{\frac{n}{d}}\right)$-query algorithm of [Beretta et al., SODA 2026], which additionally uses Neighbour queries. The paper also proves a complementary lower bound: any algorithm with access to Degree, Neighbour, and RandEdge queries must make $\Omega\left(\min\left(d, \frac{\alpha}{d}\right)\right)$ queries, even when $n$ and $\alpha$ are known, and the same bound holds when Pair and FullNbr queries are also available. Together, the results indicate that the improvement comes from exploiting graph structure (arboricity) rather than from additional access to Neighbour or FullNbr queries, and they address both questions raised by [Beretta et al., SODA 2026].
Abstract
We propose a randomized algorithm that, given query access to a graph $G$ with arboricity $\alpha$ and average degree $d$, makes $\widetilde{O}\left(\frac{\alpha}{\varepsilon^2 d}\right)$ \texttt{Degree} and $\widetilde{O}\left(\frac{1}{\varepsilon^2}\right)$ \texttt{Random Edge} queries to obtain an estimate $\widehat{d}$ satisfying $\widehat{d} \in (1\pm\varepsilon)d$. This improves on the $\widetilde{O}_{\varepsilon,\log n}\left(\sqrt{\frac{n}{d}}\right)$-query algorithm of [Beretta et al., SODA 2026], which has access to \texttt{Degree}, \texttt{Neighbour}, and \texttt{Random Edge} queries. Our algorithm does not require any graph parameter as input, not even the size of the vertex set, and attains both simplicity and practicality through a new estimation technique. We complement our upper bound with a lower bound showing that, for all valid $n$, $d$, and $\alpha$, any algorithm with access to \texttt{Degree}, \texttt{Neighbour}, and \texttt{Random Edge} queries must make at least $\Omega\left(\min\left(d,\frac{\alpha}{d}\right)\right)$ queries to obtain a $(1\pm\varepsilon)$-multiplicative estimate of $d$, even with knowledge of $n$ and $\alpha$. We also show that, even with \texttt{Pair} and \texttt{FullNbr} queries, an algorithm must make $\Omega\left(\min\left(d,\frac{\alpha}{d}\right)\right)$ queries to obtain a $(1\pm\varepsilon)$-multiplicative estimate of $d$. Our work addresses both questions raised by [Beretta et al., SODA 2026].
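
To make the query model concrete, the following minimal Python sketch shows a toy oracle that answers only Degree and RandEdge queries, together with a textbook harmonic-mean estimator. It is an illustration only and is not the algorithm proposed in this paper: the names `ToyOracle` and `harmonic_mean_degree` and the fixed sample count are assumptions introduced here, the sketch assumes a simple graph with no isolated vertices, and it makes no attempt to achieve the query bounds stated above. It relies on the fact that a uniformly random endpoint $v$ of a uniformly random edge satisfies $\mathbb{E}[1/\deg(v)] = n/(2m) = 1/d$, so the estimator never needs $n$ as input.

```python
import random


class ToyOracle:
    """Hypothetical oracle exposing only Degree and RandEdge queries.

    The edge list is stored internally, but callers are expected to use
    only degree() and rand_edge(), mimicking the query model.
    """

    def __init__(self, edges, seed=0):
        self._edges = list(edges)
        self._deg = {}
        for u, v in self._edges:
            self._deg[u] = self._deg.get(u, 0) + 1
            self._deg[v] = self._deg.get(v, 0) + 1
        self._rng = random.Random(seed)

    def degree(self, v):
        """Degree query: return the degree of vertex v."""
        return self._deg[v]

    def rand_edge(self):
        """RandEdge query: return an edge chosen uniformly at random."""
        return self._rng.choice(self._edges)


def harmonic_mean_degree(oracle, num_samples, rng):
    """Estimate the average degree d = 2m/n without knowing n.

    Each sample picks a uniform random edge and then a uniform random
    endpoint v, so v is drawn with probability deg(v) / (2m). Averaging
    1/deg(v) therefore estimates n/(2m) = 1/d (assuming no isolated
    vertices), and the reciprocal of that average estimates d.
    """
    total = 0.0
    for _ in range(num_samples):
        u, v = oracle.rand_edge()
        w = u if rng.random() < 0.5 else v
        total += 1.0 / oracle.degree(w)
    return num_samples / total


if __name__ == "__main__":
    # Star graph: center 0 with nine leaves, so n = 10, m = 9, d = 1.8.
    star = [(0, i) for i in range(1, 10)]
    estimate = harmonic_mean_degree(ToyOracle(star, seed=3), 20000, random.Random(4))
    print(f"estimated average degree: {estimate:.3f}")
```

This sketch uses one Degree query per RandEdge query and a sample count chosen by hand. The algorithm in the paper instead adapts its number of queries without being given any graph parameter.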
