Biological Neurons Compete with Deep Reinforcement Learning in Sample Efficiency in a Simulated Gameworld
Moein Khajehnejad, Forough Habibollahi, Aswin Paul, Adeel Razi, Brett J. Kagan
TL;DR
The study compares sample efficiency between DishBrain in vitro neural cultures and three deep reinforcement learning algorithms (DQN, A2C, PPO) in a Pong-like task under identical real-time sample budgets. Biological cultures demonstrate superior learning speed and performance across multiple input densities, suggesting higher sample efficiency than contemporary RL methods. The discussion situates these results within broader debates on biologically plausible learning mechanisms and highlights active inference as a promising, biologically inspired alternative. Methodologically, the work combines a high-density multi-electrode array (MEA) closed-loop platform with varied input encodings and extensive RL hyperparameter exploration, pointing to synthetic biological intelligence (SBI) systems as a compelling direction for real-time, energy-efficient learning with potential implications for AI algorithm development.
Abstract
How do biological systems and machine learning algorithms compare in the number of samples required to show significant improvements in completing a task? We compared the learning efficiency of in vitro biological neural networks to that of state-of-the-art deep reinforcement learning (RL) algorithms in a simplified simulation of the game 'Pong'. Using DishBrain, a system that embodies in vitro neural networks with in silico computation using a high-density multi-electrode array, we contrasted the learning rate and performance of these biological systems against time-matched learning from three state-of-the-art deep RL algorithms (i.e., DQN, A2C, and PPO) in the same game environment. This allowed a meaningful comparison between biological neural systems and deep RL. We found that when samples are limited to a real-world time course, even these very simple biological cultures outperformed deep RL algorithms across various game performance characteristics, implying a higher sample efficiency. Ultimately, even when tested across multiple types of information input to assess the impact of higher-dimensional data input, biological neurons showed faster learning than all deep RL agents.
