Evolving Self-Assembling Neural Networks: From Spontaneous Activity to Experience-Dependent Learning

Erwan Plantec; Joachin W. Pedersen; Milton L. Montero; Eleni Nisioti; Sebastian Risi

TL;DR

The paper addresses the limitation of conventional neural networks in lifelong adaptability by introducing Lifelong Neural Developmental Programs (LNDPs), a framework that enables simultaneous synaptic and structural plasticity guided by local activity and global reward. An instantiation based on Graph Transformer layers and GRUs, augmented with a spontaneous-activity pre-development phase, demonstrates self-organizing networks that learn from random or empty initial connections across classic control tasks. Key contributions include the LNDP formalism, a GT+GRU-based implementation for node/edge dynamics, and empirical evidence that structural plasticity and spontaneous activity improve fast adaptation and lifelong learning, especially in non-stationary environments. The results highlight LNDPs as a viable path toward self-organizing, lifetime-adaptive ANNs that better approximate the adaptability observed in biological neural networks, with implications for fields requiring continual learning and robust generalization.

Abstract

Biological neural networks are characterized by their high degree of plasticity, a core property that enables the remarkable adaptability of natural organisms. Importantly, this plasticity affects both the synaptic strength and the topology of nervous systems. Artificial neural networks, on the other hand, have mainly been designed as static, fully connected structures that can be notoriously brittle in the face of changing environments and novel inputs. Building on previous work on Neural Developmental Programs (NDPs), we propose a class of self-organizing neural networks capable of synaptic and structural plasticity in an activity- and reward-dependent manner, which we call Lifelong Neural Developmental Programs (LNDPs). We present an instance of such a network built on the graph transformer architecture and propose a mechanism for pre-experience plasticity based on the spontaneous activity of sensory neurons. Our results demonstrate the ability of the model to learn from experience in different control tasks, starting from randomly connected or empty networks. We further show that structural plasticity is advantageous in environments that require fast adaptation or have non-stationary rewards.

Paper Structure (18 sections, 8 equations, 5 figures, 1 table)

Introduction
Background
Meta-learning Synaptic Plasticity
Developmental Encodings
Lifelong Neural Developmental Programs
Initialization
Nodes
Edges
Structural plasticity
Network dynamics
Implementation
Spontaneous activity
Methods
Environments
Training
...and 3 more sections

Figures (5)

Figure 1: LNDP components. (a) Node model: node features $h^t$, activations $v^t$, and additional graph structural features are fed through a single Graph Transformer layer, whose output is fed to a GRU to obtain the new node states $h^{t+1}$. (b) Edge model: edges are also modelled with GRUs and take as input the pre- and post-synaptic node states and the last reward received, $r^t$. (c) Network topology: networks are divided into input (blue), hidden (black), and output (red) neurons. Connections can only exist from input to hidden, hidden to hidden, and hidden to output. Some nodes can have no connections at all, and the total number of hidden nodes is constant. The hyperparameters $\mu^{conn}$ and $\sigma^{conn}$ define the (truncated normal) distribution of the initial network density.
Figure 2: Training curves of the LNDP with varying initialization distributions (where $\mu^{conn}$ is the mean connection probability and $\sigma^{conn}$ its variance), with structural plasticity (SP) enabled (red) and disabled (blue). In all conditions, structurally plastic LNDPs outperform non-structurally-plastic ones in CartPole and Foraging. Models without SP are not evaluated on empty networks ($\mu^{conn}=0$), since the network would remain empty and the model would necessarily fail.
Figure 3: Structural plasticity. Changes in the output nodes' in-degrees over the agent's lifetime in the Foraging task. Red and blue lines correspond to the left and right options respectively, while shaded regions indicate the current location of the reward (red for left, blue for right). New connections are created towards the current best option, while synapses towards the other action are pruned.
Figure 4: LNDPs in CartPole. (a) Training curves showing the average reward of the population for each episode separately. (b) Distribution of rewards at each episode (from top to bottom) obtained by one evolved LNDP trained on 5 episodes; episodes 5 to 9 (shown in red) are thus outside the training distribution. (c) Evolution of a network through its SA phase. Node colors encode the states $h^t$ and edge colors the weights $w^t$. Inputs are on the left and outputs on the right. (d) 1,000 developmental trajectories of a single evolved LNDP in the weight-distribution space formed by the first two statistical moments (mean and variance of the weights). Trajectories correspond to different random initializations, and colors encode the agent's return (bluer lines correspond to failures and greener lines to successes), with the $\star$ indicating the starting point. (e) Trajectory of the network activations $v^t$ through time (dimensions are reduced through PCA).
Figure 5: Performance of LNDP and NDP in the Foraging and CartPole environments. The NDP is obtained by ablating the network updates during the agent's lifetime (i.e. after the SA phase). Both models go through $100$ SA steps. While the performance of the two approaches is more similar in the CartPole domain, the Foraging task requires an agent to adapt during its lifetime, which only the LNDP is capable of.
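
The topology constraint and initial density described in Figure 1(c), and the in-degree measure plotted in Figure 3, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, and the sketch assumes that a single density value is drawn from the truncated normal on [0, 1], that each permitted edge is then sampled independently with that probability, and that $\sigma^{conn}$ is used as the standard deviation of the untruncated normal. Hidden self-connections are not excluded here, which is also an assumption.

```python
import random


def allowed_mask(n_in, n_hid, n_out):
    """mask[i][j] is True if an edge i -> j is permitted by the topology
    in Figure 1(c): input->hidden, hidden->hidden, hidden->output."""
    kind = ["in"] * n_in + ["hid"] * n_hid + ["out"] * n_out
    ok = {("in", "hid"), ("hid", "hid"), ("hid", "out")}
    n = len(kind)
    return [[(kind[i], kind[j]) in ok for j in range(n)] for i in range(n)]


def sample_density(mu_conn, sigma_conn, rng):
    """Draw a connection probability from a normal truncated to [0, 1].
    Rejection sampling; assumes mu_conn itself lies in [0, 1]."""
    if sigma_conn <= 0:
        return min(1.0, max(0.0, mu_conn))
    while True:
        p = rng.gauss(mu_conn, sigma_conn)
        if 0.0 <= p <= 1.0:
            return p


def sample_adjacency(n_in, n_hid, n_out, mu_conn, sigma_conn, seed=0):
    """Sample an initial boolean adjacency matrix (assumed Bernoulli edges)."""
    rng = random.Random(seed)
    mask = allowed_mask(n_in, n_hid, n_out)
    p = sample_density(mu_conn, sigma_conn, rng)
    n = len(mask)
    adj = [[mask[i][j] and rng.random() < p for j in range(n)]
           for i in range(n)]
    return adj, p


def in_degrees(adj):
    """Number of incoming edges per node (the quantity tracked in Figure 3)."""
    n = len(adj)
    return [sum(adj[i][j] for i in range(n)) for j in range(n)]


if __name__ == "__main__":
    adj, p = sample_adjacency(n_in=4, n_hid=8, n_out=2,
                              mu_conn=0.5, sigma_conn=0.2, seed=1)
    print("sampled density:", round(p, 3))
    print("output in-degrees:", in_degrees(adj)[-2:])
```

With `mu_conn=0.0` and `sigma_conn=0.0` the sketch yields an empty network, which is the case where only structural plasticity can add connections.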
