I Like To Move It -- Computation Instead of Data in the Brain
Fabian Czappa, Marvin Kaster, Felix Wolf
TL;DR
The paper addresses two bottlenecks that limit the scalability of brain simulations with structural plasticity: connectivity updates and spike exchange. In a naïve implementation of the Model of Structural Plasticity (MSP), the connectivity update costs $O(n^2)$ for $n$ neurons; a Barnes–Hut approximation reduces this to $O(n \log n)$, but communication still dominates at scale. The paper introduces two methods. The first is a location-aware Barnes–Hut algorithm that moves the computation to where the data resides, achieving $O(1)$ per-neuron communication in the worst case. The second is a firing-rate approximation that reduces synchronization by exchanging firing frequencies rather than individual spikes, controlled by an epoch length $\Delta$. Theoretical analysis and large-scale experiments show that the connectivity-update time drops by up to a factor of 6, the spike-exchange time by more than two orders of magnitude, and the overall wall-clock time by about 78.8%, with data-transfer costs greatly reduced. These advances enable larger MSP-based brain simulations and point to GPU acceleration as a route toward more extensive brain models, with the mapping of repeated Barnes–Hut computations to GPUs left as a future challenge.
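The firing-rate idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`firing_rates`, `reconstruct_spikes`, `sync_points`) are hypothetical, and drawing surrogate spikes from a Bernoulli distribution on the receiving side is an assumption made for illustration. The sketch only shows why exchanging one rate per epoch of length $\Delta$ replaces one synchronization per time step with one per epoch.

```python
import random

def firing_rates(spike_trains, delta):
    """Sender side: summarise the last `delta` steps of each local neuron
    (a list of 0/1 spike flags) as a single firing rate."""
    return [sum(train[-delta:]) / delta for train in spike_trains]

def reconstruct_spikes(rates, delta, rng):
    """Receiver side (assumed scheme): draw a surrogate 0/1 spike train of
    length `delta` for every remote neuron from its transmitted rate."""
    return [[1 if rng.random() < r else 0 for _ in range(delta)] for r in rates]

def sync_points(steps, delta):
    """Number of exchanges needed for `steps` time steps when one exchange
    happens per epoch of `delta` steps (ceiling division)."""
    return -(-steps // delta)
```

With these definitions, `sync_points(1000, 1)` corresponds to exchanging spikes at every step, while `sync_points(1000, 100)` needs a hundredth of the exchanges, at the cost of the receiver seeing only statistically equivalent spike trains.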
Abstract
The detailed functioning of the human brain is still poorly understood. Brain simulations are a well-established way to complement experimental research, but must contend with the computational demands of the approximately $10^{11}$ neurons and the $10^{14}$ synapses connecting them. The network formed by these synapses is referred to as the connectome. Studies suggest that changes in the connectome (i.e., the formation and deletion of synapses, also known as structural plasticity) are essential for critical tasks such as memory formation and learning. The connectivity update can be efficiently computed using a Barnes–Hut-inspired approximation that lowers the computational complexity from $O(n^2)$ to $O(n \log n)$, where $n$ is the number of neurons. However, updating synapses, which relies heavily on remote memory access (RMA), and the spike exchange between neurons, which requires all-to-all communication at every time step, still hinder scalability. We present a new algorithm that significantly reduces the communication overhead by moving computation instead of data. This shrinks the time it takes to update connectivity by a factor of six and the time it takes to exchange spikes by more than two orders of magnitude.
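To make the $O(n^2)$ versus $O(n \log n)$ distinction concrete, the following Python sketch applies a Barnes–Hut-style acceptance criterion to a toy problem. Several things here are assumptions for illustration and not taken from the paper: neurons live on a one-dimensional interval instead of in 3D space, the attraction between neurons is a Gaussian kernel with width `sigma`, and the names `Node`, `kernel`, and `approx_weight` as well as the opening parameter `theta` are hypothetical. A distant tree cell is treated as a single point at its centroid, so far fewer nodes than neurons need to be evaluated.

```python
import math

class Node:
    """Binary tree cell over the interval [lo, hi) holding neuron positions."""
    def __init__(self, lo, hi, points):
        self.lo, self.hi = lo, hi
        self.count = len(points)
        self.centroid = sum(points) / len(points)
        self.children = []
        # Subdivide until a cell holds one point or becomes negligibly small.
        if self.count > 1 and hi - lo > 1e-9:
            mid = (lo + hi) / 2
            left = [p for p in points if p < mid]
            right = [p for p in points if p >= mid]
            for a, b, pts in ((lo, mid, left), (mid, hi, right)):
                if pts:
                    self.children.append(Node(a, b, pts))

def kernel(d, sigma=0.2):
    """Assumed Gaussian distance kernel."""
    return math.exp(-(d * d) / (sigma * sigma))

def approx_weight(node, x, theta=0.3):
    """Return (approximate kernel sum at position x, tree nodes evaluated)."""
    d = abs(node.centroid - x)
    # Accept the whole cell if it is a leaf or looks small from x.
    if not node.children or (d > 0 and (node.hi - node.lo) / d < theta):
        return node.count * kernel(d), 1
    total, visited = 0.0, 1
    for child in node.children:
        w, v = approx_weight(child, x, theta)
        total += w
        visited += v
    return total, visited
```

Summing `kernel` over all positions directly costs one evaluation per neuron for every query, whereas `approx_weight` opens only the cells close to `x`. Smaller values of `theta` open more cells and trade speed for accuracy.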
