Evolutionary Generation of Random Surreal Numbers for Benchmarking

Matthew Roughan

TL;DR

This work introduces an evolutionary synthesis method to generate random ensembles of surreal numbers with controlled complexity, enabling robust benchmarking of recursive algorithms and network-like data. By treating surreal numbers as DAGs and using a clade-based evolution with a Poisson-distributed number of parents and a generation-weighting scheme, the author derives a new two-parameter distribution $\mathrm{Su}(\lambda, \alpha)$ governing the generation $g(x)$ of a surreal form $x$, with closed-form CDF $P(g(x)\le k)=e^{-\lambda \alpha^{k}}$ and PMF $P(g(x)=0)=e^{-\lambda}$, $P(g(x)=k)=e^{-\lambda \alpha^{k}} - e^{-\lambda \alpha^{k-1}}$ for $k\ge1$. Empirical results show convergence of generation and graph statistics, delineate the final ensemble’s structure (roughly geometric tails, quadratic growth of nodes with generation, a linear relation between nodes and edges), and reveal how the split-point distribution shapes integer prevalence via Simon’s Extra Option Theorem. The approach yields a practical benchmark data generator for surreal-number computations and broader DAG-like data, with open-source code and clear avenues for extending the synthesis to other constrained networks.
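The closed-form CDF and PMF quoted above are simple enough to evaluate directly. The following is a minimal sketch, not the paper's released code: the function names `su_cdf`, `su_pmf` and `su_sample` are hypothetical, and the sketch assumes $\lambda > 0$ and $0 < \alpha < 1$ (as with the $\alpha=0.8$, $\lambda=3.5$ used in the figures), so that the CDF tends to 1 as $k$ grows.

```python
import math
import random


def _check_params(lam, alpha):
    # Assumption of this sketch: lam > 0 and 0 < alpha < 1, so the CDF -> 1.
    if not (lam > 0 and 0 < alpha < 1):
        raise ValueError("expected lam > 0 and 0 < alpha < 1")


def su_cdf(k, lam, alpha):
    """P(g <= k) = exp(-lam * alpha**k) for integer k >= 0, and 0 for k < 0."""
    _check_params(lam, alpha)
    if k < 0:
        return 0.0
    return math.exp(-lam * alpha ** k)


def su_pmf(k, lam, alpha):
    """P(g = k) as the difference of consecutive CDF values."""
    return su_cdf(k, lam, alpha) - su_cdf(k - 1, lam, alpha)


def su_sample(lam, alpha, rng=random):
    """Inverse-CDF sampling: return the smallest k with CDF(k) >= u."""
    u = rng.random()  # uniform on [0, 1)
    k = 0
    while su_cdf(k, lam, alpha) < u:
        k += 1
    return k


if __name__ == "__main__":
    lam, alpha = 3.5, 0.8  # parameter values quoted in the figure captions
    for k in range(6):
        print(k, su_pmf(k, lam, alpha))
```

Because $1 - e^{-\lambda\alpha^{k}} \approx \lambda\alpha^{k}$ once $\lambda\alpha^{k}$ is small, the upper tail of this distribution decays approximately geometrically with ratio $\alpha$, which is consistent with the "roughly geometric tails" noted above.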

Abstract

There are many areas of scientific endeavour where large, complex datasets are needed for benchmarking. Evolutionary computing provides a means towards creating such sets. As a case study, we consider Conway's Surreal numbers. They have largely been treated as a theoretical construct, with little effort towards empirical study, at least in part because of the difficulty of working with all but the smallest numbers. To advance this status, we need efficient algorithms, and in order to develop such we need benchmark data sets of surreal numbers. In this paper, we present a method for generating ensembles of random surreal numbers to benchmark algorithms. The approach uses an evolutionary algorithm to create the benchmark datasets where we can analyse and control features of the resulting test sets. Ultimately, the process is designed to generate networks with defined properties, and we expect this to be useful for other types of network data.

Paper Structure

This paper contains 13 sections, 5 theorems, 16 equations, 8 figures, 1 algorithm.

Key Result

lemma 1

Given the process described in Algorithm 1, the expected proportion of surreal forms with generation number 0 is $e^{-\lambda}$.

Figures (8)

  • Figure 1: A DAG depicting a surreal form of 3/2. Boxes represent a surreal number (value in the top section, and the left and right sets shown in the bottom sections), with left and right parents shown by red and blue arrows.
  • Figure 2: The predicted PMF of the generation distribution showing the empirical distributions (derived from 30 simulations, iterated through 50 clades, with population size $n=500$ and $g^{(0)}_{max} = 1$, $\alpha=0.8, \lambda=3.5$) and the predicted surreal distribution. The geometric approximation is also shown (dashed line), and the empirical distributions are shown for both splitting functions (Uniform and Binomial), though there is no significant difference.
  • Figure 3: The mean of average generation, and number of nodes and edges for elements of a sequence of clades (for population size $n=4000$ and $g^{(0)}_{max} = 1$, $\alpha=0.8, \lambda=3.5$). Dotted vertical lines show the iteration at which the value first reaches 99% of the eventual mean.
  • Figure 4: Convergence with respect to population size $n$ ($\alpha=0.8, \lambda=3.5$), showing that node-size converges to different values.
  • Figure 5: Convergence time, i.e., until variables converge to within 1% of their final value ($n=4000$).
  • ...and 3 more figures
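To make the DAG view in Figure 1 concrete, here is a minimal sketch of a surreal form as a node with left and right parent sets, together with its generation and its node and edge counts. The class `SurrealForm` and the helpers `generation` and `dag_size` are hypothetical names for illustration, not the paper's code. The example builds the form $\{1 \mid 2\}$ of 3/2, which is an assumption and may not be the exact form drawn in the figure; generation is taken as 0 for $\{\,\mid\,\}$ and one more than the largest parent generation otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurrealForm:
    """A surreal form {left | right}; parents are themselves SurrealForms."""
    left: tuple = ()
    right: tuple = ()


def generation(x):
    """0 for the empty form, else 1 + the maximum generation of any parent."""
    parents = x.left + x.right
    if not parents:
        return 0
    return 1 + max(generation(p) for p in parents)


def dag_size(x):
    """Count distinct nodes and parent edges in the DAG rooted at x."""
    seen = set()
    edges = 0
    stack = [x]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # structurally equal forms are treated as one node
        seen.add(node)
        parents = node.left + node.right
        edges += len(parents)
        stack.extend(parents)
    return len(seen), edges


# Illustrative forms (an assumption of this sketch):
zero = SurrealForm()                                   # { | }   = 0
one = SurrealForm(left=(zero,))                        # {0 | }  = 1
two = SurrealForm(left=(one,))                         # {1 | }  = 2
three_halves = SurrealForm(left=(one,), right=(two,))  # {1 | 2} = 3/2

if __name__ == "__main__":
    print(generation(three_halves))  # 3
    print(dag_size(three_halves))    # (4, 4)
```

Sharing the node for 1 between the left set of 3/2 and the left set of 2 is what makes the structure a DAG rather than a tree, and it is why node and edge counts are tracked separately in the statistics above.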

Theorems & Definitions (13)

  • definition 1
  • definition 2
  • Remark
  • Remark
  • lemma 1
  • proof
  • theorem 1
  • proof
  • theorem 2
  • proof
  • ...and 3 more