Robust Parameter Fitting to Realistic Network Models via Iterative Stochastic Approximation

Thomas Bläsius, Sarel Cohen, Philipp Fischbeck, Tobias Friedrich, Martin S. Krejca

TL;DR

The paper tackles the challenge of selecting parameters for random graph models to reproduce observed network features, especially after reducing to the largest connected component. It introduces ParFit, an anytime parameter-fitting method based on the Robbins-Monro stochastic approximation with iterate averaging, which updates parameters using just a single network sample per iteration. Across Erdős–Rényi, Chung–Lu, and GIRG models, plus 35 real networks, ParFit achieves high feature fidelity (large $E$-based correlations and small MAEs) with relatively few iterations, enabling efficient and robust parameter recovery. The approach improves practical applicability of synthetic network generation for benchmarking and analysis and opens avenues for exploring higher-dimensional geometries and alternative feature mappings.

Abstract

Random graph models are widely used to understand network properties and graph algorithms. Key to such analyses are the different parameters of each model, which affect various network features, such as its size, clustering, or degree distribution. The exact effect of the parameters on these features is not well understood, mainly because we lack tools to thoroughly investigate this relation. Moreover, the parameters cannot be considered in isolation, as changing one affects multiple features. Existing approaches for finding the best model parameters of desired features, such as a grid search or estimating the parameter-feature relations, are not well suited, as they are inaccurate or computationally expensive. We introduce an efficient iterative fitting method, named ParFit, that finds parameters using only a few network samples, based on the Robbins-Monro algorithm. We test ParFit on three well-known graph models, namely Erdős-Rényi, Chung-Lu, and geometric inhomogeneous random graphs, as well as on real-world networks, including web networks. We find that ParFit performs well in terms of quality and running time across most parameter configurations.
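
The iteration described above can be illustrated with a small sketch. This is not the authors' ParFit implementation: it is a minimal averaged Robbins-Monro loop under simplifying assumptions, fitting a single Erdős-Rényi parameter (the expected average degree `k`, with the node count `n` held fixed) so that the average degree inside the largest connected component matches a target. The function names, the step-size exponent `0.6`, the clamping bounds, and the default values are all hypothetical choices made for illustration.

```python
import random
from collections import deque


def sample_er(n, k, rng):
    """Sample G(n, p) with p = k / (n - 1); return adjacency lists."""
    p = min(1.0, max(0.0, k / (n - 1)))
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def lcc_average_degree(adj):
    """Average degree inside the largest connected component (via BFS)."""
    seen = [False] * len(adj)
    best = []
    for s in range(len(adj)):
        if seen[s]:
            continue
        seen[s] = True
        comp, queue = [s], deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    comp.append(v)
                    queue.append(v)
        if len(comp) > len(best):
            best = comp
    # Every neighbour of a component node lies in the same component.
    return sum(len(adj[u]) for u in best) / len(best)


def fit_average_degree(target, n=200, iterations=150, k0=2.0, seed=0):
    """Averaged Robbins-Monro fit of k (assumed setup, one sample per step)."""
    rng = random.Random(seed)
    k, k_sum = k0, 0.0
    for t in range(1, iterations + 1):
        # Exactly one network sample per iteration.
        observed = lcc_average_degree(sample_er(n, k, rng))
        step = t ** -0.6  # decaying gain; the exponent is an assumption
        # Move k toward the target and keep it in a valid range.
        k = min(n - 1.0, max(0.1, k + step * (target - observed)))
        k_sum += k
    # Iterate averaging: return the mean of all iterates, not the last one.
    return k_sum / iterations
```

Calling `fit_average_degree(4.0)` returns a value of `k` somewhat below 4, because restricting to the largest connected component removes low-degree nodes and raises the measured average degree. Since the loop returns a running average, it can be stopped after any iteration and still yield a usable estimate, which is the sense in which such a scheme is anytime.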

Paper Structure

This paper contains 23 sections, 2 equations, 5 figures, 8 tables, 1 algorithm.

Figures (5)

  • Figure 1: ParFit (Algorithm \ref{alg:fitting-method}) evaluated on several scenarios of geometric inhomogeneous random graphs (GIRGs; see also Section \ref{sec:network_models}). We sample 50 GIRG instances from a parameter configuration, take the mean of their features, run ParFit, and take 50 samples based on the fitted parameters. For the four relevant features of GIRG instances, the plots show the target feature value (given to ParFit; $x$-axis) against the mean actual feature value of 50 samples based on the fitted parameters ($y$-axis). The color shows the number of iterations (i.e., number of samples) taken by ParFit; a darker color indicates fewer iterations. The diagonal line shows the identity function. Please refer to Section \ref{sec:randomGraphsResults} for a discussion.
  • Figure 2: The power law exponents (PLEs) of all real-world networks ($x$-axis) versus the respective PLEs fitted by ParFit (Algorithm \ref{alg:fitting-method}, $y$-axis) assuming a GIRG model (Section \ref{sec:network_models}). The $x$-axis shows the range of the three PLE estimators in \cite{voitalov2019scale}. Red lines refer to strong power laws, blue to the rest. The diagonal is the identity function. See Section \ref{sec:applications} for a discussion.
  • Figure 3: ParFit (Algorithm \ref{alg:fitting-method}) evaluated on several Erdős--Rényi scenarios. We sample 50 Erdős--Rényi instances from a parameter configuration, take the mean of their features, run ParFit, and take 50 samples based on the fitted parameters. For the two relevant features of Erdős--Rényi instances, the plots show the target feature value (given to ParFit; $x$-axis) against the mean actual feature value of 50 samples based on the fitted parameters ($y$-axis). The color shows the number of iterations (i.e., number of samples) taken by ParFit; a darker color indicates fewer iterations. The diagonal line shows the identity function.
  • Figure 4: ParFit (Algorithm \ref{alg:fitting-method}) evaluated on several Chung--Lu scenarios. We sample 50 Chung--Lu instances from a parameter configuration, take the mean of their features, run ParFit, and take 50 samples based on the fitted parameters. For the three relevant features of Chung--Lu instances, the plots show the target feature value (given to ParFit; $x$-axis) against the mean actual feature value of 50 samples based on the fitted parameters ($y$-axis). The color shows the number of iterations (i.e., number of samples) taken by ParFit; a darker color indicates fewer iterations. The diagonal line shows the identity function.
  • Figure 5: For every considered real-world network, we run the parameter fitting algorithm assuming the GIRG model and take 50 samples based on the fitted parameters. The plot shows the measured heterogeneity/clustering of the real-world network versus the mean heterogeneity/clustering of the 50 samples based on the fitted parameters. The color shows the number of iterations (i.e., number of samples) taken by the fitting method. The diagonal line shows the identity function.
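
The evaluation protocol shared by the captions for Figures 1, 3, and 4 (sample 50 instances, average their features to get a target, fit, sample 50 instances from the fitted parameters, compare) can be sketched as a small harness. The sketch below is illustrative only. `evaluate_scenario` is a hypothetical helper, and `toy_feature` and `toy_fit` are stand-ins for a real graph sampler and for ParFit; they are not the paper's models or code. The mean absolute error at the end is one plausible way to summarize how far the points lie from the identity line.

```python
import random
from statistics import fmean


def evaluate_scenario(sample_feature, fit, true_param, rng, num_samples=50):
    """Return the (target, achieved) feature pair for one configuration."""
    # x-axis value: mean feature over samples from the ground-truth parameter.
    target = fmean(sample_feature(true_param, rng) for _ in range(num_samples))
    fitted_param = fit(target, rng)
    # y-axis value: mean feature over samples from the fitted parameter.
    achieved = fmean(sample_feature(fitted_param, rng) for _ in range(num_samples))
    return target, achieved


# --- Toy stand-ins (hypothetical, not the paper's models or fitter) ---
def toy_feature(theta, rng):
    """Noisy scalar feature whose expectation is 2 * theta."""
    return 2.0 * theta + rng.gauss(0.0, 0.5)


def toy_fit(target, rng, iterations=200):
    """Averaged Robbins-Monro iteration using one sample per step."""
    theta, total = 0.0, 0.0
    for t in range(1, iterations + 1):
        theta += t ** -0.6 * (target - toy_feature(theta, rng))
        total += theta
    return total / iterations


rng = random.Random(0)
pairs = [evaluate_scenario(toy_feature, toy_fit, p, rng) for p in (0.5, 1.0, 2.0)]
# Mean absolute deviation of the points from the identity line.
mae = fmean(abs(x - y) for x, y in pairs)
```

Passing the sampler and the fitter as callables keeps the harness independent of the model, so the same function could in principle be reused for Erdős--Rényi, Chung--Lu, or GIRG samplers once those are available.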