Structurally Flexible Neural Networks: Evolving the Building Blocks for General Agents
Joachim Winther Pedersen, Erwan Plantec, Eleni Nisioti, Milton Montero, Sebastian Risi
TL;DR
Structural rigidity in RL networks limits cross-domain generalization. The authors propose Structurally Flexible Neural Networks (SFNNs), which combine sparse, parameterized neurons with GRU-based synaptic plasticity and multiple neuron and synapse types, so that a single parameter set can adapt across environments with different input and output shapes. Optimized with CMA-ES over agent lifetimes in three tasks, SFNNs self-organize rapidly within a lifetime and generalize across the tasks, outperforming ablations and symmetric baselines. This work suggests a path toward foundation-model-like RL agents that operate across diverse tasks without architecture re-engineering. With a network of 32 neurons and diverse building blocks, SFNNs offer a scalable approach to flexible, environment-agnostic control.
Abstract
Artificial neural networks used for reinforcement learning are structurally rigid: each optimized parameter is tied to its specific placement in the network structure, and a network only works with pre-defined, fixed input and output sizes. This follows from the number of optimized parameters depending directly on the structure of the network. Structural rigidity limits the ability to optimize policy parameters across multiple environments that do not share input and output spaces. Here, we evolve a set of neurons and plastic synapses, each represented by a gated recurrent unit (GRU). During optimization, the parameters of these fundamental units of a neural network are optimized in different random structural configurations. Earlier work has shown that parameter sharing between units is important for making neurons structurally flexible. We show that it is possible to optimize a set of distinct neuron and synapse types, which mitigates the symmetry dilemma. We demonstrate this by optimizing a single set of neurons and synapses to solve multiple reinforcement learning control tasks simultaneously.
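The idea that the number of optimized parameters can be decoupled from network structure can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes a single synapse type, a hand-written GRU cell in NumPy, plain tanh neurons in place of GRU neurons, and the convention that the first component of a synapse's hidden state is read as its weight. All function names (`init_gru`, `gru_step`, `build_network`, `run_step`, `rollout`) are hypothetical. It shows one shared set of GRU parameters driving every synapse in two random sparse networks with different input and output sizes.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(rng, in_dim, hid_dim):
    """Parameters of ONE GRU cell, shared by every synapse (illustrative init)."""
    def mat(rows, cols):
        return rng.normal(0.0, 0.1, size=(rows, cols))
    p = {}
    for gate in ("z", "r", "h"):
        p["W" + gate] = mat(in_dim, hid_dim)   # input weights
        p["U" + gate] = mat(hid_dim, hid_dim)  # recurrent weights
        p["b" + gate] = np.zeros(hid_dim)      # bias
    return p

def gru_step(p, x, h):
    """Standard GRU update, batched over synapses (rows of x and h)."""
    z = sigmoid(x @ p["Wz"] + h @ p["Uz"] + p["bz"])
    r = sigmoid(x @ p["Wr"] + h @ p["Ur"] + p["br"])
    h_cand = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"] + p["bh"])
    return (1.0 - z) * h + z * h_cand

def count_params(p):
    return sum(v.size for v in p.values())

def build_network(rng, n_in, n_hidden, n_out, density=0.5):
    """Random sparse connectivity; returns neuron count and synapse endpoints."""
    n = n_in + n_hidden + n_out
    pre, post = np.nonzero(rng.random((n, n)) < density)
    return n, pre, post

def run_step(p, n, pre, post, act, syn_state):
    # Each synapse sees only local information: pre- and post-synaptic activity.
    x = np.stack([act[pre], act[post]], axis=1)
    syn_state = gru_step(p, x, syn_state)
    weights = syn_state[:, 0]  # assumption: first hidden unit acts as the weight
    summed = np.zeros(n)
    np.add.at(summed, post, weights * act[pre])  # accumulate inputs per neuron
    return np.tanh(summed), syn_state

def rollout(p, n_in, n_out, n_hidden=8, steps=5, seed=0):
    rng = np.random.default_rng(seed)
    n, pre, post = build_network(rng, n_in, n_hidden, n_out)
    syn_state = np.zeros((pre.size, p["Uz"].shape[0]))
    act = np.zeros(n)
    for _ in range(steps):
        act[:n_in] = rng.normal(size=n_in)  # stand-in for an observation
        act, syn_state = run_step(p, n, pre, post, act, syn_state)
    return act[-n_out:]  # last n_out neurons are treated as outputs

params = init_gru(np.random.default_rng(42), in_dim=2, hid_dim=4)
out_small = rollout(params, n_in=3, n_out=1)
out_large = rollout(params, n_in=8, n_out=4)
print(count_params(params), out_small.shape, out_large.shape)
```

Because the parameters live in the shared unit rather than in a weight matrix sized to the network, the same `params` can be evaluated in networks of any shape. An evolutionary optimizer such as CMA-ES would then search over the flattened entries of `params` only.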
