NeuroBench: A Framework for Benchmarking Neuromorphic Computing Algorithms and Systems

Jason Yik, Korneel Van den Berghe, Douwe den Blanken, Younes Bouhadjar, Maxime Fabre, Paul Hueber, Weijie Ke, Mina A Khoei, Denis Kleyko, Noah Pacik-Nelson, Alessandro Pierro, Philipp Stratmann, Pao-Sheng Vincent Sun, Guangzhi Tang, Shenqi Wang, Biyan Zhou, Soikat Hasan Ahmed, George Vathakkattil Joseph, Benedetto Leto, Aurora Micheli, Anurag Kumar Mishra, Gregor Lenz, Tao Sun, Zergham Ahmed, Mahmoud Akl, Brian Anderson, Andreas G. Andreou, Chiara Bartolozzi, Arindam Basu, Petrut Bogdan, Sander Bohte, Sonia Buckley, Gert Cauwenberghs, Elisabetta Chicca, Federico Corradi, Guido de Croon, Andreea Danielescu, Anurag Daram, Mike Davies, Yigit Demirag, Jason Eshraghian, Tobias Fischer, Jeremy Forest, Vittorio Fra, Steve Furber, P. Michael Furlong, William Gilpin, Aditya Gilra, Hector A. Gonzalez, Giacomo Indiveri, Siddharth Joshi, Vedant Karia, Lyes Khacef, James C. Knight, Laura Kriener, Rajkumar Kubendran, Dhireesha Kudithipudi, Shih-Chii Liu, Yao-Hong Liu, Haoyuan Ma, Rajit Manohar, Josep Maria Margarit-Taulé, Christian Mayr, Konstantinos Michmizos, Dylan R. Muir, Emre Neftci, Thomas Nowotny, Fabrizio Ottati, Ayca Ozcelikkale, Priyadarshini Panda, Jongkil Park, Melika Payvand, Christian Pehle, Mihai A. Petrovici, Christoph Posch, Alpha Renner, Yulia Sandamirskaya, Clemens JS Schaefer, André van Schaik, Johannes Schemmel, Samuel Schmidgall, Catherine Schuman, Jae-sun Seo, Sadique Sheik, Sumit Bam Shrestha, Manolis Sifalakis, Amos Sironi, Kenneth Stewart, Matthew Stewart, Terrence C. Stewart, Jonathan Timcheck, Nergis Tömen, Gianvito Urgese, Marian Verhelst, Craig M. Vineyard, Bernhard Vogginger, Amirreza Yousefzadeh, Fatima Tuz Zohora, Charlotte Frenkel, Vijay Janapa Reddi

TL;DR

NeuroBench addresses the lack of standardized benchmarks in neuromorphic computing by proposing a collaborative, dual-track framework with hardware-independent algorithm benchmarks and hardware-dependent system benchmarks, supported by an open-source harness. The v1.0 suite defines four algorithm tasks: few-shot class-incremental learning (FSCIL), event-camera object detection, non-human primate (NHP) motor prediction, and Mackey-Glass chaotic forecasting. It also defines two system tasks, acoustic scene classification (ASC) and quadratic unconstrained binary optimization (QUBO), so that the suite spans edge-to-datacenter workloads, and it pairs these tasks with a set of correctness and complexity metrics. Baseline results show that spiking neural networks (SNNs) and echo state networks (ESNs) can achieve competitive correctness with lower complexity on temporal tasks, while neuromorphic hardware often delivers superior energy efficiency compared with conventional CPUs in suitable workloads, underscoring the value of co-design across algorithm and system stacks. The framework is designed to evolve iteratively through community contributions, with open tooling, versioned benchmarks, and plans to incorporate closed-loop and embodied benchmarks to advance realistic neuromorphic evaluation and adoption.

Abstract

Neuromorphic computing shows promise for advancing computing efficiency and capabilities of AI applications using brain-inspired principles. However, the neuromorphic research field currently lacks standardized benchmarks, making it difficult to accurately measure technological advancements, compare performance with conventional methods, and identify promising future research directions. Prior neuromorphic computing benchmark efforts have not seen widespread adoption due to a lack of inclusive, actionable, and iterative benchmark design and guidelines. To address these shortcomings, we present NeuroBench: a benchmark framework for neuromorphic computing algorithms and systems. NeuroBench is a collaboratively-designed effort from an open community of researchers across industry and academia, aiming to provide a representative structure for standardizing the evaluation of neuromorphic approaches. The NeuroBench framework introduces a common set of tools and systematic methodology for inclusive benchmark measurement, delivering an objective reference framework for quantifying neuromorphic approaches in both hardware-independent (algorithm track) and hardware-dependent (system track) settings. In this article, we outline tasks and guidelines for benchmarks across multiple application domains, and present initial performance baselines across neuromorphic and conventional approaches for both benchmark tracks. NeuroBench is intended to continually expand its benchmarks and features to foster and track the progress made by the research community.

Paper Structure (7 sections, 24 equations, 10 figures, 7 tables, 1 algorithm)

Figures (10)

  • Figure 1: The two NeuroBench tracks: algorithms and systems. Grey boxes designate what is defined by the benchmark, and orange boxes indicate what is unique to each solution. Arrows connecting the two tracks denote the co-innovation between them and the cross-stack innovation enabled by this approach. Best-performing results from each track can motivate future solutions in the other. In addition, system metrics and results can inform hardware-independent algorithmic complexity metrics.
  • Figure 2: An overview of the NeuroBench algorithm track.
  • Figure 3: Test accuracy per session on the keyword FSCIL task for prototypical and frozen baselines, showing accuracy on both base classes and incrementally-learned classes (left) and accuracy on incrementally-learned classes only (right). Incremental session 0 refers to the accuracy on base classes after pre-training only. The shaded area spans the 5$^{th}$ to 95$^{th}$ percentiles over 100 runs. Frozen baselines with no adaptation do not learn incremental classes and thus have a fixed 0% accuracy for New Classes Performance.
  • Figure 4: Footprint and effective synaptic operations vs $R^2$, for four task baselines. Each model has two points: the solid marker represents NHP Indy, and the hollow marker represents NHP Loco.
  • Figure 5: ESN and LSTM models evaluated on varying Mackey-Glass time series using a constant set of hyperparameters.
  • ...and 5 more figures