Hypernetworks That Evolve Themselves

Joachim Winther Pedersen, Erwan Plantec, Eleni Nisioti, Marcello Barylli, Milton Montero, Kathrin Korte, Sebastian Risi

TL;DR

The paper addresses the limitations of gradient-based optimization by embedding evolutionary dynamics inside neural networks through Self-Referential Graph HyperNetworks (Self-Referential GHNs) that jointly generate and mutate their own weights. It combines a stochastic hypernetwork for variation with a deterministic hypernetwork for task-specific weight generation, enabling derivative-free optimization and rapid adaptation to non-stationary environments. Across CartPole-Switch, LunarLander-Switch, and Ant-v5 benchmarks, the approach yields swift recovery after environmental shifts and emergent control of mutation rates, leading to improved open-ended learning. This work advances toward autonomous agents whose evolvability itself evolves, bridging artificial evolution with biological concepts and offering a path to self-sustaining, open-ended learning systems.

Abstract

How can neural networks evolve themselves without relying on external optimizers? We propose Self-Referential Graph HyperNetworks, systems where the very machinery of variation and inheritance is embedded within the network. By uniting hypernetworks, stochastic parameter generation, and graph-based representations, Self-Referential GHNs mutate and evaluate themselves while adapting mutation rates as selectable traits. Through new reinforcement learning benchmarks with environmental shifts (CartPole-Switch, LunarLander-Switch), Self-Referential GHNs show swift, reliable adaptation and emergent population dynamics. In the locomotion benchmark Ant-v5, they evolve coherent gaits, showing promising fine-tuning capabilities by autonomously decreasing variation in the population to concentrate around promising solutions. Our findings support the idea that evolvability itself can emerge from neural self-reference. Self-Referential GHNs reflect a step toward synthetic systems that more closely mirror biological evolution, offering tools for autonomous, open-ended learning agents.
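
The idea of mutation rates as selectable traits can be illustrated with a minimal sketch. This is not the authors' method: it is a classic self-adaptive evolution strategy on a toy 2D target-matching task, where each individual carries its own log step size and inherits it with noise. All names (`fitness`, `mutate`, `evolve`), the targets, and the hyperparameters are assumptions made for illustration; the target switch loosely mimics an environmental shift.

```python
import math
import random


def fitness(genome, target):
    """Toy fitness: negative squared distance to the current target."""
    return -sum((g - t) ** 2 for g, t in zip(genome, target))


def mutate(parent, rng, tau=0.3):
    """Mutate the step size first, then use it to mutate the genome."""
    genome, log_sigma = parent
    child_log_sigma = log_sigma + tau * rng.gauss(0.0, 1.0)
    sigma = math.exp(child_log_sigma)
    child_genome = [g + sigma * rng.gauss(0.0, 1.0) for g in genome]
    return child_genome, child_log_sigma


def evolve(population, target, generations, rng, n_parents=5, n_children=4):
    """Truncation selection with elitism; returns the population sorted best-first."""
    def by_fitness(ind):
        return fitness(ind[0], target)

    for _ in range(generations):
        population = sorted(population, key=by_fitness, reverse=True)
        parents = population[:n_parents]
        children = [mutate(p, rng) for p in parents for _ in range(n_children)]
        population = parents + children
    return sorted(population, key=by_fitness, reverse=True)


rng = random.Random(0)
population = [([0.0, 0.0], 0.0) for _ in range(25)]

# Phase 1: evolve toward the first (hypothetical) target.
population = evolve(population, target=[3.0, -2.0], generations=40, rng=rng)
print("best sigma before switch:", math.exp(population[0][1]))

# Phase 2: the target switches; step sizes are free to grow again.
population = evolve(population, target=[-1.0, 4.0], generations=40, rng=rng)
print("best sigma after switch:", math.exp(population[0][1]))
```

Because the step size is inherited along with the genome, lineages whose step size suits the current phase tend to leave more descendants, so the population's variation is itself under selection. In the paper, this role is played by the network's own stochastic hypernetwork rather than by an explicit step-size gene.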

Paper Structure

This paper contains 17 sections, 8 figures.

Figures (8)

  • Figure 1: Self-Referential Graph HyperNetwork. A Graph Hypernetwork produces parameters for another network by considering its computational graph. The GHN learns embeddings for all the node types in the target network and updates these in its graph neural network (GNN) module. These embeddings are then passed on to a hypernetwork module that generates parameters for each node. Self-Referential GHNs have two hypernetwork models: a stochastic hypernetwork that is used to produce updates to copies of the GHN itself, and a deterministic hypernetwork that is used to generate parameters for a target network, in our case, a policy network for reinforcement learning environments. Through this combination, the approach can adapt rapidly to the task at hand. In the figure, for brevity, we depict the GNN, the deterministic, and the stochastic hypernetwork as single nodes, although they each consist of multiple layers that in the experiments are represented as separate nodes in the computational graph.
  • Figure 2: 2D switching task. (A) Environment 1: Surface with tiles for navigation (brighter = higher fitness). Points mark where every evaluated policy ended in this phase, with the darker points corresponding to individuals in later generations. Midway through evolution, the landscape switches to Environment 2. (B) Environment 2: Surface with endpoints for all individuals that were evaluated after the shift. (C) Chronological Family Tree: Every dot is an individual; layout follows the order individuals were born. Hue is the individual’s absolute fitness in [0,1]. The blue and green path traces show the lineages of the champions in Environment 1 and 2, respectively. The red divider marks the environment switch. (D) Genealogical Family Tree: Compressed lineage showing only individuals that produced offspring; the horizontal position encodes genealogical generation (parent depth). Colors and champion paths are as in (C). (E) Behavioral Innovations: Ancestor-by-ancestor rollouts for each champion. Each tile shows the actual trajectory (green) taken by that ancestor on the appropriate surface, with arrows indicating chronological order. The champions share a common ancestor, but the lineage splits before the environmental shift. These results illustrate the ability of a population of Self-Referential GHNs to adapt to non-stationary tasks.
  • Figure 3: Environments. Left: CartPole used for the CartPole-Switch task. Middle: LunarLander used for the LunarLander-Switch task. Right: Ant-v5.
  • Figure 4: CartPole-Switch: Performance of Self-Referential GHNs in the CartPole-Switch environment. The curves for ten different evolution runs with different seeds are shown, with the mean curve overlaid (opaque blue lines). In all ten runs, the Self-Referential GHNs recover their performance after both switches, such that the best-performing individuals in the population score the highest possible fitness.
  • Figure 5: CartPole-Switch: Performance of evolution algorithms used for comparison in the CartPole-Switch environment. None of the algorithms can consistently recover their performance after both switches. The GESMR algorithm produces populations with large amounts of variation compared to all other algorithms. However, even though the GESMR algorithm can introduce large amounts of variation shortly after the environmental changes, it also fails to recover performance.
  • ...and 3 more figures
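
The two-headed structure described in the Figure 1 caption can be sketched in a few lines. This is a heavily simplified toy under stated assumptions, not the paper's implementation: there is no GNN module, both "hypernetworks" are single linear maps, and the class name `ToySelfReferentialGHN`, its fields, and the `base_scale` constant are all hypothetical. It only shows the division of labor: a deterministic head maps node embeddings to policy parameters, while a stochastic head sets the noise scale used when the generator copies and perturbs its own parameters.

```python
import copy
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class ToySelfReferentialGHN:
    """Toy stand-in: linear heads over fixed node embeddings, no GNN."""

    def __init__(self, rng, embed_dim=4, n_policy_nodes=3, base_scale=0.1):
        def vec():
            return [rng.gauss(0.0, 0.1) for _ in range(embed_dim)]

        self.base_scale = base_scale
        # Mutable parameter groups of the generator itself.
        self.params = {
            "det_head": vec(),  # deterministic hypernetwork (assumed linear)
            "sto_head": vec(),  # stochastic hypernetwork (assumed linear)
        }
        # One embedding per node of a hypothetical target policy graph.
        self.policy_node_emb = [vec() for _ in range(n_policy_nodes)]
        # One embedding per parameter group of the generator (held fixed here).
        self.self_node_emb = {name: vec() for name in self.params}

    def generate_policy_params(self):
        """Deterministic path: one policy parameter per target node."""
        return [dot(e, self.params["det_head"]) for e in self.policy_node_emb]

    def mutation_scales(self):
        """Stochastic path: a positive noise scale per own parameter group."""
        return {
            name: self.base_scale * math.exp(dot(emb, self.params["sto_head"]))
            for name, emb in self.self_node_emb.items()
        }

    def reproduce(self, rng):
        """Copy the generator and perturb the copy with self-chosen scales."""
        child = copy.deepcopy(self)
        scales = self.mutation_scales()
        for name, values in child.params.items():
            child.params[name] = [v + rng.gauss(0.0, scales[name]) for v in values]
        return child


rng = random.Random(0)
parent = ToySelfReferentialGHN(rng)
child = parent.reproduce(rng)
print("parent policy params:", parent.generate_policy_params())
print("child policy params: ", child.generate_policy_params())
```

Because `sto_head` is itself one of the perturbed parameter groups, a child inherits a slightly different way of mutating its own offspring, which is the self-referential element the caption describes.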