Intelligent Neural Networks: From Layered Architectures to Graph-Organized Intelligence
Antoine Salomon
TL;DR
This work introduces Intelligent Neural Networks (INN), a graph-based paradigm in which neurons are intelligent agents with internal memory that communicate via learned attention on a complete graph, replacing traditional layered architectures. It demonstrates that graph topology, rather than primitive units alone, yields stable and competitive learning: INN achieves 1.705 BPC on Text8, on par with an optimized LSTM and better than a comparable Transformer, while a stacked Mamba baseline diverges under the same protocol. Ablation studies show that topology and selective inter-neuron routing are essential for both performance and stability, and intermediate analyses reveal emergent hub-like specialization enabling interpretable dynamics. WikiText-2 results corroborate robustness to lexical diversity, though word-level benchmarks reveal vocabulary bottlenecks that limit INN’s advantage in larger vocabularies. The paper argues for a new direction in neural design that prioritizes modular, interpretable, graph-structured computation and outlines clear paths for scaling and dynamic topology learning.
Abstract
Biological neurons exhibit remarkable intelligence: they maintain internal states, communicate selectively with other neurons, and self-organize into complex graphs rather than rigid hierarchical layers. What if artificial intelligence could emerge from similarly intelligent computational units? We introduce Intelligent Neural Networks (INN), a paradigm shift where neurons are first-class entities with internal memory and learned communication patterns, organized in complete graphs rather than sequential layers. Each Intelligent Neuron combines selective state-space dynamics (knowing when to activate) with attention-based routing (knowing to whom to send signals), enabling emergent computation through graph-structured interactions. On the standard Text8 character modeling benchmark, INN achieves 1.705 bits per character (BPC), significantly outperforming a comparable Transformer (2.055 BPC) and matching a highly optimized LSTM baseline. Crucially, a parameter-matched baseline of stacked Mamba blocks fails to converge (>3.4 BPC) under the same training protocol, demonstrating that INN's graph topology provides essential training stability. Ablation studies confirm this: removing inter-neuron communication degrades performance or leads to instability, supporting the value of learned neural routing. This work demonstrates that neuron-centric design with graph organization is not merely bio-inspired: it is computationally effective, opening new directions for modular, interpretable, and scalable neural architectures.
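To make the neuron-centric idea concrete, the following is a minimal sketch of one communication step among neurons on a complete graph. It is an illustration under stated assumptions, not the implementation used in this work: the function names (`routing_step`, `random_matrix`), the projection matrices `Wq`, `Wk`, `Wv`, `Wg`, the exclusion of self-edges, and the simple input-dependent gate standing in for selective state-space dynamics are all hypothetical choices made for readability.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, v):
    return [dot(row, v) for row in W]

def random_matrix(d, rng):
    # Hypothetical helper: small random d x d matrix standing in for learned weights.
    return [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(d)]

def routing_step(states, x, Wq, Wk, Wv, Wg):
    """One step for n neurons, each holding a state vector of size d.

    states: list of n state vectors (each neuron's internal memory).
    x: input vector of size d, shared by all neurons (an assumption).
    Returns the updated states and the per-neuron attention weights.
    """
    n, d = len(states), len(states[0])
    queries = [matvec(Wq, s) for s in states]
    keys = [matvec(Wk, s) for s in states]
    values = [matvec(Wv, s) for s in states]
    # Input-dependent gate in (0, 1): a stand-in for "knowing when to activate".
    gate = [1.0 / (1.0 + math.exp(-g)) for g in matvec(Wg, x)]

    new_states, attention = [], []
    for i in range(n):
        # Complete graph: neuron i may attend to every other neuron j != i.
        others = [j for j in range(n) if j != i]
        scores = [dot(queries[i], keys[j]) / math.sqrt(d) for j in others]
        weights = softmax(scores)  # "knowing to whom to send signals"
        message = [sum(w * values[j][k] for w, j in zip(weights, others))
                   for k in range(d)]
        # Gated update: keep part of the old memory, write message plus input.
        new_states.append([g * s + (1.0 - g) * (m + xi)
                           for g, s, m, xi in zip(gate, states[i], message, x)])
        attention.append(weights)
    return new_states, attention

if __name__ == "__main__":
    rng = random.Random(0)
    d, n = 4, 3
    Wq, Wk, Wv, Wg = (random_matrix(d, rng) for _ in range(4))
    states = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
    x = [rng.uniform(-1, 1) for _ in range(d)]
    states, attention = routing_step(states, x, Wq, Wk, Wv, Wg)
    print(len(states), [len(w) for w in attention])
```

In this sketch the attention weights of each neuron form a distribution over the other neurons, so inspecting `attention` gives a direct view of which neurons act as hubs; dropping the `message` term corresponds to removing inter-neuron communication.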
