Improved Bounds with a Simple Algorithm for Edge Estimation for Graphs of Unknown Size

Debarshi Chanda

TL;DR

This work tackles the sublinear-time problem of estimating a graph’s average degree $d$ without knowing the graph size $n$ or arboricity $\alpha$. It presents a simple, parameter-oblivious algorithm that uses only Degree and RandEdge queries to obtain a $(1\pm\varepsilon)$-multiplicative estimate of $d$, achieving $\widetilde{O}\left(\frac{\alpha}{\varepsilon^{2}d}\right)$ Degree queries and $\widetilde{O}\left(\frac{1}{\varepsilon^{2}}\right)$ RandEdge queries in expectation, improving prior bounds. The paper also proves lower bounds across multiple query models, showing that any algorithm must incur at least $\Omega\left(\min\left(d, \frac{\alpha}{d}\right)\right)$ queries under Degree, Neighbour, RandEdge (and extensions with Pair or FullNbr), thereby tightly characterizing the complexity landscape for edge estimation with unknown graph size. The results demonstrate that substantial improvements can be achieved by exploiting graph structure (arboricity) rather than relying on additional access to Neighbour or FullNbr queries, and they address open questions raised by prior work, with practical implications for simple, scalable graph analysis in unknown-size settings.

Abstract

We propose a randomized algorithm with query access that, given a graph $G$ with arboricity $\alpha$ and average degree $d$, makes $\widetilde{O}\left(\frac{\alpha}{\varepsilon^2 d}\right)$ Degree and $\widetilde{O}\left(\frac{1}{\varepsilon^2}\right)$ Random Edge queries to obtain an estimate $\widehat{d}$ satisfying $\widehat{d} \in (1\pm\varepsilon)d$. This improves the $\widetilde{O}_{\varepsilon,\log n}\left(\sqrt{\frac{n}{d}}\right)$ query algorithm of [Beretta et al., SODA 2026] that has access to Degree, Neighbour, and Random Edge queries. Our algorithm does not require any graph parameter as input, not even the size of the vertex set, and attains both simplicity and practicality through a new estimation technique. We complement our upper bounds with a lower bound showing that, for all valid $n$, $d$, and $\alpha$, any algorithm that has access to Degree, Neighbour, and Random Edge queries must make at least $\Omega\left(\min\left(d,\frac{\alpha}{d}\right)\right)$ queries to obtain a $(1\pm\varepsilon)$-multiplicative estimate of $d$, even with knowledge of $n$ and $\alpha$. We also show that even with Pair and FullNbr queries, an algorithm must make $\Omega\left(\min\left(d,\frac{\alpha}{d}\right)\right)$ queries to obtain a $(1\pm\varepsilon)$-multiplicative estimate of $d$. Our work addresses both questions raised by [Beretta et al., SODA 2026].
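To make the query model concrete, the following is a minimal Python sketch of estimating the average degree using only Degree and Random Edge queries, without knowing $n$. It is not the paper's algorithm: it uses the standard harmonic-mean identity, which assumes a simple graph with no isolated vertices. A uniformly random endpoint $w$ of a uniformly random edge satisfies $\Pr[w = v] = \deg(v)/2m$, so $\mathbb{E}[1/\deg(w)] = n/2m = 1/d$. The names `GraphOracle` and `estimate_average_degree` are hypothetical, and the sketch carries none of the abstract's query-complexity guarantees.

```python
import random
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


class GraphOracle:
    """Toy oracle (hypothetical) exposing only Degree and Random Edge queries."""

    def __init__(self, edges: List[Edge], seed: int = 0) -> None:
        self._edges = list(edges)
        self._deg: Dict[int, int] = {}
        for u, v in self._edges:
            self._deg[u] = self._deg.get(u, 0) + 1
            self._deg[v] = self._deg.get(v, 0) + 1
        self._rng = random.Random(seed)
        self.degree_queries = 0
        self.rand_edge_queries = 0

    def degree(self, v: int) -> int:
        """Degree query: return deg(v)."""
        self.degree_queries += 1
        return self._deg.get(v, 0)

    def rand_edge(self) -> Edge:
        """Random Edge query: return a uniformly random edge."""
        self.rand_edge_queries += 1
        return self._rng.choice(self._edges)


def estimate_average_degree(oracle: GraphOracle, samples: int, seed: int = 1) -> float:
    """Harmonic-mean estimate of d = 2m/n; assumes no isolated vertices.

    Illustrative only: this is a textbook estimator, not the paper's method.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        u, v = oracle.rand_edge()
        w = u if rng.random() < 0.5 else v  # uniform endpoint of the sampled edge
        total += 1.0 / oracle.degree(w)     # E[1/deg(w)] = 1/d
    return samples / total


if __name__ == "__main__":
    # Star with 9 leaves: n = 10, m = 9, so d = 2 * 9 / 10 = 1.8.
    star = [(0, leaf) for leaf in range(1, 10)]
    print(estimate_average_degree(GraphOracle(star), samples=20000))
```

Each sample costs one Random Edge query and one Degree query, and the vertex count never appears in the estimator. The estimate can have high variance on skewed degree sequences; handling that with few queries is where structural parameters such as arboricity enter the paper's analysis.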

Paper Structure

This paper contains 22 sections, 25 theorems, 20 equations, 2 figures, 1 table, 5 algorithms.

Key Result

Theorem 1

There exists a randomized algorithm that, given access to Degree and RandEdge queries to a graph $G$ with arboricity $\alpha$ and average degree $d = \Omega(1)$, outputs an estimate $\widehat{d}$ such that $\widehat{d} \in (1\pm\varepsilon)d$ with probability at least ...

Figures (2)

  • Figure 1: An overall picture of the main ideas for the algorithm.
  • Figure 2: The construction for the Lower Bounds (Theorems "Lower Bound - High Degree" and "Lower Bound - Low Degree")

Theorems & Definitions (39)

  • Theorem 1: Upper Bound - General Graphs
  • Theorem 2: Upper Bound - No Isolated Vertices
  • Theorem 3: Lower Bound
  • Definition 4: Heavy and Light Vertices ($\tau$)
  • Definition 5: Good Threshold
  • Definition 6: Arboricity ($\alpha$)
  • Lemma 7: Nash-Williams Theorem
  • Corollary 8: Arboricity based bound on $m$
  • Corollary 9
  • Lemma 10
  • ...and 29 more