Heuristically Adaptive Diffusion-Model Evolutionary Strategy

Benedikt Hartl, Yanbo Zhang, Hananel Hazan, Michael Levin

TL;DR

This framework marks a major heuristic and algorithmic transition, offering increased flexibility, precision, and control in evolutionary optimization processes, and elevates evolutionary algorithms from procedures with shallow heuristics to frameworks with deep memory.

Abstract

Diffusion Models represent a significant advancement in generative modeling, employing a dual-phase process that first degrades domain-specific information via Gaussian noise and restores it through a trainable model. This framework enables pure noise-to-data generation and modular reconstruction of images or videos. Concurrently, evolutionary algorithms employ optimization methods inspired by biological principles to refine sets of numerical parameters encoding potential solutions to rugged objective functions. Our research reveals a fundamental connection between diffusion models and evolutionary algorithms through their shared underlying generative mechanisms: both methods generate high-quality samples via iterative refinement on random initial distributions. By employing deep learning-based diffusion models as generative models across diverse evolutionary tasks and iteratively refining diffusion models with heuristically acquired databases, we can iteratively sample potentially better-adapted offspring parameters, integrating them into successive generations of the diffusion model. This approach achieves efficient convergence toward high-fitness parameters while maintaining explorative diversity. Diffusion models introduce enhanced memory capabilities into evolutionary algorithms, retaining historical information across generations and leveraging subtle data correlations to generate refined samples. We elevate evolutionary algorithms from procedures with shallow heuristics to frameworks with deep memory. By deploying classifier-free guidance for conditional sampling at the parameter level, we achieve precise control over evolutionary search dynamics to further specific genotypical, phenotypical, or population-wide traits. Our framework marks a major heuristic and algorithmic transition, offering increased flexibility, precision, and control in evolutionary optimization processes.

Paper Structure

This paper contains 26 sections, 9 equations, 8 figures, 1 algorithm.

Figures (8)

  • Figure 1: (A) A schematic flow-chart of a typical evolutionary algorithm (gray arrows and labels) contrasted with our diffusion model (DM)-based evolutionary optimization (golden arrows and symbols, see also \ref{algo:methods:charles}), showing an evolutionary process that utilizes either a population-based (gray) or an ANN-based DM (golden) heuristically refined generative model for offspring-genotype sampling. The DM-based EA's generative model learns from heuristic experience by training on an epigenetic joint dataset of genome, fitness, and (potentially) conditional feature data of a particular genotype in its environment. We then utilize the successively refined DM to sample high-quality (high-fitness) offspring candidate solutions for a particular environment; via classifier-free guidance techniques \cite{ho2022classifier}, this generative process can potentially be biased towards desired target traits in the environment on the phenotypic level. (B) Schematics of DM-based evolutionary optimization in an environment with two Gaussian optima at $\bm\mu_\pm=(\pm1,\pm1)$, with the search dynamics conditioned either on a target parameter range $x,y>0$ (red) or $x,y<0$ (blue), or alternating between the two peaks through dynamic conditioning (green). (C) Schematics of conditional DM evolution of high-fitness genotypes (low to high fitness color-coded from black through orange to white) that maintains diversity (spread in parameter space). (D) Schematic behavior of DM-based conditionally evolved reinforcement learning (RL) \cite{sutton1998reinforcement} agents deployed in the cart-pole environment \cite{barto1983neuronlike}; the agents are evolved to maximize fitness (balance the pole), but conditionally sampled to steer the cart to a certain location (here A, B, or C) without changing the reward signal.
  • Figure 2: Workflow of the CHARLES-D algorithm. Starting with a randomly initialized population (red circles), the fitness and features of its members (shown in rounded blue rectangles) are evaluated. Next, a generative model $\mathcal{G}$ (yellow rectangles) is trained on this population, weighted by fitness, with features used as conditioning for generation. Following training, external target conditions are provided to generate a new population that meets specified requirements. The evaluation-training-generation loop is then repeated. A buffer is maintained to store the population along with its fitness and features for training, enabling full data utilization and preserving population diversity.
  • Figure 3: HADES adapts to dynamic (oscillatory) environmental changes. (A) Dynamically alternating double-peak fitness landscape ranging from $f_{\min}=-1$ (black) through $f_0=0$ to $f_{\max}=1$ (white) as defined by \ref{eq:doublepeak:dynamic}. (B) Population data for HADES (blue) and CMA-ES \cite{hansen2001completely} (red) optimization in the dynamically changing environment illustrated in (A). The 2D data points ${\bm{g}}_i=(x_i, y_i)$ are represented as 1D projections $\hat{{\bm{g}}}_i$ onto the diagonal illustrated as dashed lines in (A); the background color indicates the fitness score along $x=y$, and the radius of each data point $\hat{{\bm{g}}}_i$ scales with its fitness $f_i$. (C) Fitness of the data shown in (B): the solid line illustrates the maximum fitness evaluation of the population averaged over 10 statistically independent simulations; the shaded area illustrates the average spread of the population's maximum fitness. While HADES reliably identifies the current maximum in the alternating double-peak environment, CMA-ES clearly struggles to adapt a population to the changing environment in time, as the majority of the population resides in the vicinity of one peak.
  • Figure 4: Conditional evolutionary optimization to explore selected target parameter regions in a two-dimensional double-peak fitness landscape. (A, C, E) Fitness landscape (grayscale) and distribution of population data (projected onto the $x=y$ diagonal) for 10 statistically independent simulations vs. generations as violin plots, while conditioning the generative DM to sample novel data points from the first quadrant $x,y>0$ (red), second and fourth quadrants $x\times y <0$ (yellow), and third quadrant $x,y<0$ (blue). (B, D, F) Fitness landscape in grayscale from $f_{\min}=0$ (gray) to $f_{\max}=1$ (white) with overlaid data points (colored dots) of an exemplary population from panels (A, C, E), respectively, after 9 generations; the target quadrants are illustrated as color-shaded areas.
  • Figure 5: Dynamically Conditioning Genetic Parameters. (A) A static double-peak fitness landscape, ranging from $f_{\min} = 0$ (gray) to $f_{\max} = 1$ (white) as defined by \ref{eq:doublepeak:dynamic} with $\omega=\phi=0$. Dynamical conditioning allows exploration of the first quadrant (red) or the third quadrant (blue), see also \ref{fig:results:dynamic:env}. (B) The fitness landscape (grayscale) and the distribution of population data (projected onto the $x=y$ line) for 10 statistically independent simulations vs. generations (radii of green-colored data points scale with fitness). The average population mean is illustrated by the black solid line, while the gray area marks the STD. The oscillating red $\leftrightarrow$ blue color-coding of the fitness landscape reflects the applied condition for the first or third quadrant during DM sampling, respectively, leading to jumps of the population from one peak to the other; transition generations are marked by white vertical dashed lines. (C) The mean and STD of maximum fitness (solid green line and shaded area) demonstrate consistently high fitness values, even during transitions of the conditional sampling.
  • ...and 3 more figures
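
The evaluation-training-generation loop described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' implementation: a fitness-weighted Gaussian stands in for the diffusion model, and conditioning is approximated by filtering buffer records on a quadrant feature rather than by classifier-free guidance. The landscape places its optima at $(\pm1,\pm1)$ as in Figure 1 (B), but the peak width `SIGMA`, the helper names (`fitness`, `feature`, `fit_model`, `evolve`), the first-in-first-out buffer policy, and all hyperparameters are assumptions made for illustration.

```python
import math
import random

SIGMA = 0.5  # assumed peak width (illustrative, not taken from the paper)
PEAKS = ((1.0, 1.0), (-1.0, -1.0))  # optima at mu_+ and mu_- as in Figure 1 (B)


def fitness(g):
    """Static double-peak landscape: the larger of two Gaussian bumps."""
    return max(
        math.exp(-((g[0] - px) ** 2 + (g[1] - py) ** 2) / (2 * SIGMA ** 2))
        for px, py in PEAKS
    )


def feature(g):
    """Hypothetical conditioning feature: +1 first quadrant, -1 third, else 0."""
    if g[0] > 0 and g[1] > 0:
        return 1
    if g[0] < 0 and g[1] < 0:
        return -1
    return 0


def fit_model(records):
    """Stand-in for DM training: fitness-weighted mean and std per axis."""
    total = sum(f for _, f, _ in records)
    mean = [sum(f * g[d] for g, f, _ in records) / total for d in range(2)]
    var = [sum(f * (g[d] - mean[d]) ** 2 for g, f, _ in records) / total
           for d in range(2)]
    std = [max(math.sqrt(v), 0.05) for v in var]  # floor keeps some diversity
    return mean, std


def evolve(target, generations=15, pop_size=64, buffer_size=256, seed=0):
    """Run the loop and return the fittest genotype of the final population."""
    rng = random.Random(seed)
    population = [(rng.uniform(-2, 2), rng.uniform(-2, 2))
                  for _ in range(pop_size)]
    buffer = []
    for _ in range(generations):
        # 1) Evaluate fitness and features, then store them in the buffer.
        buffer.extend((g, fitness(g), feature(g)) for g in population)
        buffer = buffer[-buffer_size:]  # assumed policy: keep newest records
        # 2) "Train" on records whose feature matches the target condition;
        #    fall back to the whole buffer if none match.
        matching = [r for r in buffer if r[2] == target] or buffer
        mean, std = fit_model(matching)
        # 3) Generate the next population from the refitted model.
        population = [(rng.gauss(mean[0], std[0]), rng.gauss(mean[1], std[1]))
                      for _ in range(pop_size)]
    return max(population, key=fitness)


if __name__ == "__main__":
    for target in (1, -1):
        best = evolve(target)
        print(target, best, round(fitness(best), 3))
```

Calling `evolve(1)` steers the population toward the peak in the first quadrant and `evolve(-1)` toward the one in the third, loosely mirroring the conditioning shown in Figure 4. A single Gaussian cannot represent both peaks at once, which is one reason a more expressive generative model such as a DM is of interest here.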