Sparse Spiking Neural-like Membrane Systems on Graphics Processing Units

Javier Hernández-Tello, Miguel Ángel Martínez-del-Amor, David Orellana-Martín, Francis George C. Cabarle

TL;DR

This work introduces GPU-accelerated, compressed sparse matrix representations for Spiking Neural P systems (SNPs) to reduce the time and memory wasted when the underlying graph, and hence its matrix, is sparse. By implementing and parallelizing two compression schemes (ELL and Compressed/Optimized) on GPUs, the authors report substantial speedups and memory savings over dense baselines and state-of-the-art GPU libraries, enabling the simulation of SNP systems with delays on very large graphs. Across two benchmarks (sorting natural numbers and subset sum), the Compressed design delivers up to 83× speedup over Sparse and makes large-scale instances feasible on an A100 80GB. The authors present this as a step toward broader framework development and extensions to other SNP variants.

Abstract

The parallel simulation of Spiking Neural P systems is mainly based on a matrix representation, in which the graph inherent to the neural model is encoded in an adjacency matrix. The simulation algorithm relies on matrix-vector multiplication, an operation that is implemented efficiently on parallel devices. However, when the graph of a Spiking Neural P system is not fully connected, the adjacency matrix is sparse, and hence many computing resources are wasted in both time and memory. For this reason, two compression methods for the matrix representation were proposed in a previous work, but they were neither implemented nor parallelized in a simulator. In this paper, they are implemented and parallelized on GPUs as part of a new simulator for Spiking Neural P systems with delays. Extensive experiments are conducted on high-end GPUs (RTX2080 and A100 80GB), and the results show that the compressed representations outperform other solutions based on state-of-the-art GPU libraries when simulating Spiking Neural P systems.
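
To make the idea in the abstract concrete, here is a minimal sequential sketch in Python of one simulation step computed as a vector-matrix product, first over a dense matrix and then over an ELL-style compressed one. It assumes the generic ELL (ELLPACK) layout, in which every row stores only its nonzero entries padded to a common width; the paper's own compressed formats and GPU kernels may differ. The function names (`to_ell`, `step_dense`, `step_ell`) and the toy matrix are hypothetical and chosen only for illustration.

```python
def to_ell(dense):
    """Compress a dense matrix (list of rows) into an ELL-style layout.

    Each row keeps only its nonzero entries as (column, value) pairs and is
    padded at the end so that all rows have the same width.
    """
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    width = max((len(r) for r in rows), default=0)
    pad = (-1, 0)  # column -1 marks a padding slot (an assumption of this sketch)
    return [r + [pad] * (width - len(r)) for r in rows]


def step_dense(config, spiking, dense):
    """Return config + spiking * dense, visiting every matrix entry."""
    out = list(config)
    for i, fired in enumerate(spiking):
        for j, v in enumerate(dense[i]):
            out[j] += fired * v
    return out


def step_ell(config, spiking, ell):
    """Same update as step_dense, but visiting only the stored nonzeros."""
    out = list(config)
    for i, fired in enumerate(spiking):
        if not fired:
            continue  # a row whose spiking entry is 0 contributes nothing
        for j, v in ell[i]:
            if j < 0:
                break  # padding always sits at the end of a row
            out[j] += fired * v
    return out


# Toy 3x3 matrix: negative entries remove spikes, positive entries add them.
matrix = [
    [-1, 1, 0],
    [0, -1, 1],
    [0, 0, -1],
]
config = [2, 1, 1]   # current number of spikes per neuron
spiking = [1, 1, 0]  # which rows are applied in this step

print(step_dense(config, spiking, matrix))
print(step_ell(config, spiking, to_ell(matrix)))
```

Both calls produce the same configuration, but the ELL version stores and visits two entries per row instead of three. On a sparse matrix with many more columns the gap grows accordingly, which is the time and memory waste the abstract refers to.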

Paper Structure (14 sections, 6 figures, 3 tables, 5 algorithms)

Figures (6)

  • Figure 1: Execution time for the SNP systems sorting natural numbers on an RTX2080. X-axis shows the number of natural numbers to sort (initially in descending order). Y-axis shows the time in ms on a log scale.
  • Figure 2: Memory consumption for the SNP systems sorting natural numbers. X-axis shows the number of natural numbers to sort (initially in descending order). Y-axis shows the consumed memory in MB.
  • Figure 3: Execution time for the SNP systems solving subset sum on an RTX2080. X-axis shows the size of the instance, measured as the size of the input set of numbers for the problem. The time is measured in ms.
  • Figure 4: Consumed GPU memory for the SNP systems solving subset sum on an RTX2080. X-axis shows the size of the instance, with memory measured in MB.
  • Figure 5: Execution time for the SNP systems solving subset sum on an A100 80GB. X-axis shows the instance size, with time measured in ms and shown on a log scale.
  • ...and 1 more figure