Spiking Neural Networks for Continuous Control via End-to-End Model-Based Learning
Justus Huebotter, Pablo Lanillos, Marcel van Gerven, Serge Thill
TL;DR
This work demonstrates that fully spiking neural networks can be trained end-to-end for high-dimensional continuous control by integrating a predictive forward model with a goal-directed policy in a predictive-control framework. Using surrogate gradients and a carefully engineered architecture (including learnable time constants, adaptive thresholds, and latent-space compression), the Pred-Control SNN achieves control performance comparable to large non-spiking baselines while using markedly fewer parameters. Key findings show that prediction accuracy is not the sole determinant of control quality; an expressive yet trainable forward model that yields stable gradients suffices for effective planning and execution. The results establish SNNs as viable, scalable substrates for energy-efficient, high-DOF robotic control and provide design principles for robust end-to-end spiking controllers applicable to neuromorphic hardware.
Abstract
Despite recent progress in training spiking neural networks (SNNs) for classification, their application to continuous motor control remains limited. Here, we demonstrate that fully spiking architectures can be trained end-to-end to control robotic arms with multiple degrees of freedom in continuous environments. Our predictive-control framework combines Leaky Integrate-and-Fire dynamics with surrogate gradients, jointly optimizing a forward model for dynamics prediction and a policy network for goal-directed action. We evaluate this approach on both a planar 2D reaching task and a simulated 6-DOF Franka Emika Panda robot with torque control. In direct comparison to non-spiking recurrent baselines trained under the same predictive-control pipeline, the proposed SNN achieves comparable task performance while using substantially fewer parameters. An extensive ablation study highlights the role of initialization, learnable time constants, adaptive thresholds, and latent-space compression as key contributors to stable training and effective control. Together, these findings establish spiking neural networks as a viable and scalable substrate for high-dimensional continuous control, while emphasizing the importance of principled architectural and training design.
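As an illustration of the ingredients named above, the following is a minimal sketch of a discrete-time Leaky Integrate-and-Fire neuron with an adaptive threshold, plus the kind of surrogate derivative that stands in for the non-differentiable spike function during backpropagation. It is not the authors' implementation. The function names, the fast-sigmoid surrogate, the soft reset, and all parameter values (`tau_mem`, `tau_adapt`, `v_th0`, `beta_adapt`, `slope`) are assumptions chosen for illustration. In a trainable model, the time constants would be learnable parameters in an automatic-differentiation framework rather than fixed floats.

```python
import math


def surrogate_grad(v_minus_thresh, slope=10.0):
    """Fast-sigmoid surrogate derivative (an assumed choice).

    Used in the backward pass in place of the Heaviside step's derivative,
    which is zero almost everywhere and undefined at the threshold.
    """
    return 1.0 / (1.0 + slope * abs(v_minus_thresh)) ** 2


def simulate_alif(inputs, tau_mem=20.0, tau_adapt=100.0,
                  v_th0=1.0, beta_adapt=0.5, dt=1.0):
    """Simulate one adaptive LIF neuron over a sequence of input currents.

    Returns the spike train and the surrogate gradient at each time step.
    All parameter values are hypothetical defaults.
    """
    alpha = math.exp(-dt / tau_mem)    # membrane decay per step
    rho = math.exp(-dt / tau_adapt)    # threshold-adaptation decay per step
    v, a = 0.0, 0.0                    # membrane potential, adaptation trace
    spikes, grads = [], []
    for x in inputs:
        v = alpha * v + (1.0 - alpha) * x      # leaky integration
        thresh = v_th0 + beta_adapt * a        # adaptive firing threshold
        s = 1.0 if v >= thresh else 0.0        # forward pass: hard threshold
        grads.append(surrogate_grad(v - thresh))  # backward pass: smooth proxy
        spikes.append(s)
        v -= s * thresh                        # soft reset after a spike
        a = rho * a + s                        # each spike raises the threshold
    return spikes, grads


if __name__ == "__main__":
    drive = [3.0] * 200
    adaptive, _ = simulate_alif(drive)
    fixed, _ = simulate_alif(drive, beta_adapt=0.0)
    print("spikes with adaptive threshold:", int(sum(adaptive)))
    print("spikes with fixed threshold:", int(sum(fixed)))
```

Under constant drive, the adaptive variant fires less often than the fixed-threshold variant because every spike temporarily raises the threshold. The surrogate derivative peaks when the membrane potential sits at the threshold and falls off on either side, which is what allows gradients to flow through spiking layers.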
