Complexity and accessibility of random landscapes

Sakshi Pahujani, Joachim Krug

TL;DR

This work analyzes high-dimensional fitness landscapes by contrasting uncorrelated House-of-Cards (HoC) models with structured landscapes, deriving exact expressions for peak statistics and accessibility thresholds. In the HoC model with $a$ alleles per site and sequence length $L$, the expected number of peaks is $\mathbb{E}(N_L)=\dfrac{a^L}{(a-1)L+1}$, which grows exponentially in $L$, while the probability that a random genotype is a peak, $\dfrac{1}{(a-1)L+1}$, vanishes as $L$ increases, revealing a tension between ruggedness and accessibility. The study then connects epistasis to submodularity via universal negative epistasis (UNE), demonstrating how submodular landscapes guarantee a subset–superset accessibility property (AP) and yield large adaptive basins, with implications for evolutionary dynamics and optimization in complex systems. The results show when accessible paths (direct or indirect) emerge and how structured landscapes can organize them into expansive basins, offering insights relevant to biology and to related fields such as statistical physics and machine learning.
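
The HoC peak formula can be checked numerically on small landscapes. The following is a minimal sketch, assuming that $a$ is the number of alleles per site, $L$ is the sequence length, and fitness values are drawn i.i.d. from a uniform distribution (any continuous distribution should give the same peak statistics). The function names are illustrative and not taken from the paper.

```python
import itertools
import random


def expected_peaks(a, L):
    """Closed-form HoC expectation: a^L genotypes, each with (a-1)L neighbors."""
    return a**L / ((a - 1) * L + 1)


def count_peaks(a, L, rng):
    """Count local maxima of one random landscape on {0, ..., a-1}^L."""
    genotypes = list(itertools.product(range(a), repeat=L))
    # Assumption: i.i.d. uniform fitness values (uncorrelated landscape).
    fitness = {g: rng.random() for g in genotypes}
    peaks = 0
    for g in genotypes:
        # A peak is fitter than every genotype that differs at exactly one site.
        is_peak = all(
            fitness[g[:i] + (allele,) + g[i + 1:]] < fitness[g]
            for i in range(L)
            for allele in range(a)
            if allele != g[i]
        )
        peaks += is_peak
    return peaks


def mean_peaks(a, L, trials=2000, seed=0):
    """Monte Carlo estimate of the expected number of peaks."""
    rng = random.Random(seed)
    return sum(count_peaks(a, L, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    a, L = 2, 4
    print("closed form:", expected_peaks(a, L))
    print("simulation :", mean_peaks(a, L))
```

The simulated mean should approach the closed-form value as the number of trials grows; the exhaustive enumeration limits this sketch to small $a$ and $L$.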

Abstract

These notes introduce probabilistic landscape models defined on high-dimensional discrete sequence spaces. The models are motivated primarily by fitness landscapes in evolutionary biology, but links to statistical physics and computer science are mentioned where appropriate. Elementary and advanced results on the structure of landscapes are described with a focus on features that are relevant to evolutionary searches, such as the number of local maxima and the existence of fitness-monotonic paths. The recent discovery of submodularity as a biologically meaningful property of fitness landscapes and its consequences for their accessibility is discussed in detail.

Paper Structure

This paper contains 18 sections, 41 equations, 4 figures.

Figures (4)

  • Figure 3: Direct and indirect paths connecting the corners $\alpha=000$ and $\omega=111$ of the binary 3-cube. The figure shows a direct path of length $l = d(\alpha, \omega) = 3$ to the left and two indirect paths of length $l = d + 2 = 5$ and $l = d+ 4 = 7$ next to it. The rightmost path visits all nodes and is the longest possible self-avoiding path.
  • Figure 5: Hasse diagram. The diagram represents the nodes of the binary hypercube in three dimensions as elements of the power set $\mathcal{P}\{1, 2, 3\}$. Courtesy of Daniel Oros.
  • Figure 6: Submodular landscapes construction. A submodular landscape constructed by composition of a linear genotype-phenotype map and a concave phenotype-fitness map. The three individual mutations increase the phenotypic value by different amounts, and the phenotypes of all other genotypes are linear combinations of these effects. Then, a concave function maps the phenotypes to their fitness values. The non-monotonicity of this function leads to a fitness graph with multiple peaks (two in this case -- marked in red). The fitness graph in faint grey illustrates the rank ordering of the fitness values.
  • Figure 7: Illustration of the subset-superset accessibility property. Two peak genotypes (shown in dark red and dark blue) in the fitness graph are accessible from all their sub- and supersets (depicted in light red and light blue) as indicated by the arrows.
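
The construction in Figure 6 and the property in Figure 7 can be illustrated on the binary 3-cube. The following is a minimal sketch, assuming hypothetical mutational effects `(1.0, 1.5, 2.2)` and a hypothetical concave, non-monotonic fitness function $-(z - 2.3)^2$; neither is taken from the paper. Genotypes are encoded as bitmasks (bit `i` set means mutation `i` is present), and accessibility is tested as reachability along single-mutation steps that strictly increase fitness.

```python
def build_landscape(effects, phenotype_to_fitness):
    """Fitness of each genotype: concave function of an additive phenotype."""
    L = len(effects)
    fitness = {}
    for mask in range(1 << L):
        z = sum(e for i, e in enumerate(effects) if (mask >> i) & 1)
        fitness[mask] = phenotype_to_fitness(z)
    return fitness


def is_submodular(fitness, tol=1e-12):
    """Check f(A) + f(B) >= f(A | B) + f(A & B) for all genotype pairs."""
    return all(
        fitness[a] + fitness[b] >= fitness[a | b] + fitness[a & b] - tol
        for a in fitness
        for b in fitness
    )


def find_peaks(fitness, L):
    """Genotypes that are fitter than all L single-mutation neighbors."""
    return [g for g in fitness
            if all(fitness[g] > fitness[g ^ (1 << i)] for i in range(L))]


def accessible_from(fitness, L, start, target):
    """Depth-first search over steps that strictly increase fitness."""
    stack, seen = [start], {start}
    while stack:
        g = stack.pop()
        if g == target:
            return True
        for i in range(L):
            h = g ^ (1 << i)
            if h not in seen and fitness[h] > fitness[g]:
                seen.add(h)
                stack.append(h)
    return False


def has_subset_superset_accessibility(fitness, L):
    """Every peak must be reachable from all of its subsets and supersets."""
    for p in find_peaks(fitness, L):
        for g in fitness:
            is_subset = (g & p) == g
            is_superset = (g & p) == p
            if (is_subset or is_superset) and not accessible_from(fitness, L, g, p):
                return False
    return True


if __name__ == "__main__":
    effects = (1.0, 1.5, 2.2)                  # hypothetical mutation effects
    concave = lambda z: -(z - 2.3) ** 2        # hypothetical concave fitness map
    f = build_landscape(effects, concave)
    print("submodular:", is_submodular(f))
    print("peaks     :", [format(p, "03b") for p in find_peaks(f, 3)])
    print("AP holds  :", has_subset_superset_accessibility(f, 3))
```

The search accepts any fitness-increasing path, direct or indirect, so it tests the accessibility property in its weakest form; restricting the steps to mutations that move toward the peak would test direct paths only.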