Table of Contents
Fetching ...

Sensitivity-Guided Framework for Pruned and Quantized Reservoir Computing Accelerators

Atousa Jafari, Mahdi Taheri, Hassan Ghasemzadeh Mohammadi, Christian Herglotz, Marco Platzner

TL;DR

Experimental results across selected benchmarks demonstrate that the proposed approach maintains high accuracy while substantially improving computational and resource efficiency in FPGA-based implementations, with variations observed across different configurations and time series applications.

Abstract

This paper presents a compression framework for Reservoir Computing that enables systematic design-space exploration of trade-offs among quantization levels, pruning rates, model accuracy, and hardware efficiency. The proposed approach leverages a sensitivity-based pruning mechanism to identify and remove less critical quantized weights with minimal impact on model accuracy, thereby reducing computational overhead while preserving accuracy. We perform an extensive trade-off analysis to validate the effectiveness of the proposed framework and the impact of pruning and quantization on model performance and hardware parameters. For this evaluation, we employ three time-series datasets, including both classification and regression tasks. Experimental results across selected benchmarks demonstrate that our proposed approach maintains high accuracy while substantially improving computational and resource efficiency in FPGA-based implementations, with variations observed across different configurations and time series applications. For instance, for the MELBOEN dataset, an accelerator quantized to 4-bit at a 15\% pruning rate reduces resource utilization by 1.2\% and the Power Delay Product (PDP) by 50.8\% compared to an unpruned model, without any noticeable degradation in accuracy.

Sensitivity-Guided Framework for Pruned and Quantized Reservoir Computing Accelerators

TL;DR

Experimental results across selected benchmarks demonstrate that the proposed approach maintains high accuracy while substantially improving computational and resource efficiency in FPGA-based implementations, with variations observed across different configurations and time series applications.

Abstract

This paper presents a compression framework for Reservoir Computing that enables systematic design-space exploration of trade-offs among quantization levels, pruning rates, model accuracy, and hardware efficiency. The proposed approach leverages a sensitivity-based pruning mechanism to identify and remove less critical quantized weights with minimal impact on model accuracy, thereby reducing computational overhead while preserving accuracy. We perform an extensive trade-off analysis to validate the effectiveness of the proposed framework and the impact of pruning and quantization on model performance and hardware parameters. For this evaluation, we employ three time-series datasets, including both classification and regression tasks. Experimental results across selected benchmarks demonstrate that our proposed approach maintains high accuracy while substantially improving computational and resource efficiency in FPGA-based implementations, with variations observed across different configurations and time series applications. For instance, for the MELBOEN dataset, an accelerator quantized to 4-bit at a 15\% pruning rate reduces resource utilization by 1.2\% and the Power Delay Product (PDP) by 50.8\% compared to an unpruned model, without any noticeable degradation in accuracy.
Paper Structure (12 sections, 5 equations, 5 figures, 3 tables, 1 algorithm)

This paper contains 12 sections, 5 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: Reservoir computing architecture consists of three layers: input, reservoir, and output.
  • Figure 2: Overview of our RC accelerator synthesis framework, including sensitivity-guided pruning.
  • Figure 3: Performance evaluation for various quantization levels and pruning rates across selected time-series datasets.
  • Figure 4: MELBORN
  • Figure 5: HENON