UltraLIF: Fully Differentiable Spiking Neural Networks via Ultradiscretization and Max-Plus Algebra
Jose Marie Antonio Miñoza
TL;DR
UltraLIF introduces a principled, differentiable SNN framework by applying ultradiscretization from tropical geometry to LIF-type dynamics, replacing surrogate gradients with soft max-plus relaxations based on the log-sum-exp function with a learnable temperature $\varepsilon$. It yields two instantiations, UltraLIF (temporal) and UltraDLIF (spatial), both with forward-backward consistency and theoretical guarantees of convergence to classical LIF/diffusion dynamics and of bounded gradients. Empirically, it improves over surrogate-gradient baselines across six benchmarks, especially in the single-timestep ($T=1$) setting on neuromorphic and temporal data, and offers an explicit sparsity mechanism to reduce energy consumption. The work links spiking computation with tropical geometry, enabling new analytical tools and potential neuromorphic deployment strategies.
Abstract
Spiking Neural Networks (SNNs) offer energy-efficient, biologically plausible computation but suffer from non-differentiable spike generation, which forces reliance on heuristic surrogate gradients. This paper introduces UltraLIF, a principled framework that replaces surrogate gradients with ultradiscretization, a mathematical formalism from tropical geometry that provides continuous relaxations of discrete dynamics. The central insight is that the max-plus semiring underlying ultradiscretization naturally models neural threshold dynamics: the log-sum-exp function serves as a differentiable soft maximum that converges to hard thresholding as a learnable temperature parameter $\varepsilon \to 0$. Two neuron models are derived from distinct dynamical systems: UltraLIF from the LIF ordinary differential equation (temporal dynamics) and UltraDLIF from the diffusion equation modeling gap junction coupling across neuronal populations (spatial dynamics). Both yield fully differentiable SNNs trainable via standard backpropagation with no forward-backward mismatch. Theoretical analysis establishes pointwise convergence to classical LIF dynamics with quantitative error bounds, together with bounded, non-vanishing gradients. Experiments on six benchmarks spanning static images, neuromorphic vision, and audio demonstrate improvements over surrogate-gradient baselines, with gains most pronounced in single-timestep ($T{=}1$) settings on neuromorphic and temporal datasets. An optional sparsity penalty enables significant energy reduction while maintaining competitive accuracy.
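The soft-maximum idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it only shows the generic temperature-scaled log-sum-exp, $\varepsilon \log \sum_i e^{x_i/\varepsilon}$, approaching the hard maximum as $\varepsilon \to 0$, and its gradient staying bounded as a set of softmax weights. The function names `soft_max_plus` and `soft_max_plus_grad` and the sample inputs are assumptions chosen for illustration.

```python
import math


def soft_max_plus(values, eps):
    """Temperature-scaled log-sum-exp: eps * log(sum(exp(v / eps))).

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    m = max(values)
    # Subtract the maximum before exponentiating for numerical stability.
    return m + eps * math.log(sum(math.exp((v - m) / eps) for v in values))


def soft_max_plus_grad(values, eps):
    """Gradient of soft_max_plus w.r.t. each input (softmax weights)."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    m = max(values)
    weights = [math.exp((v - m) / eps) for v in values]
    total = sum(weights)
    # Each entry lies in [0, 1] and the entries sum to 1.
    return [w / total for w in weights]


if __name__ == "__main__":
    inputs = [0.3, 1.0, -0.5]  # illustrative values only
    for eps in (1.0, 0.1, 0.01):
        value = soft_max_plus(inputs, eps)
        grad = soft_max_plus_grad(inputs, eps)
        print(f"eps={eps}: soft max={value:.6f}, grad={grad}")
```

For $n$ inputs the relaxation satisfies $\max_i x_i \le \varepsilon \log \sum_i e^{x_i/\varepsilon} \le \max_i x_i + \varepsilon \log n$, so the gap to the hard maximum shrinks linearly in $\varepsilon$. Because the same function is used in both the forward and backward pass, the sketch has no separate surrogate derivative.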
