Shavette: Low Power Neural Network Acceleration via Algorithm-level Error Detection and Undervolting

Mikael Rinkinen, Lauri Koskinen, Olli Silven, Mehdi Safarpour

TL;DR

The paper addresses energy efficiency in DNN accelerators by reducing voltage margins while maintaining accuracy, using a software-only approach. It introduces Shavette, which combines Algorithm-Based Fault Tolerance (ABFT) for linear layers with Dual Modular Redundancy (DMR) for non-linear layers to enable safe undervolting without hardware modifications. Experiments on LeNet and VGG-16 running on a GPU show energy savings of about 18% to 25% per inference with no accuracy loss and a throughput penalty below 3.9%. The work demonstrates that ABFT-based undervolting is a cost-effective alternative to hardware-based techniques and can be applied to commodity devices with low runtime overhead.

Abstract

Reduced voltage operation is an effective technique for substantial energy efficiency improvement in digital circuits. This brief introduces a simple approach for enabling reduced voltage operation of Deep Neural Network (DNN) accelerators through software modifications alone. Conventional approaches for enabling reduced voltage operation, e.g., Timing Error Detection (TED) systems, incur significant development costs and overheads, and are not applicable to off-the-shelf components. In contrast, the solution proposed in this paper relies on algorithm-based error detection; hence, it is implemented with low development costs, does not require any circuit modifications, and is even applicable to commodity devices. By showcasing the solution through experiments on popular DNNs, i.e., LeNet and VGG16, on a GPU platform, we demonstrate 18% to 25% energy savings with no accuracy loss of the models and negligible throughput compromise (< 3.9%), considering the overheads from integrating the error detection schemes into the DNN. Integrating the presented algorithmic solution into a design is simpler than with conventional TED-based techniques, which require extensive circuit-level modifications, cell library characterizations, or special support from the design tools.
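
To make the idea of algorithm-based error detection concrete, the following is a minimal sketch of a checksum test on a matrix-vector product, the core operation of a fully connected layer. It is an illustration under stated assumptions, not the paper's implementation: the function names (`matvec`, `column_checksum`, `abft_check`), the tolerance value, and the injected fault are all hypothetical. The check relies on the identity that the sum of the outputs of y = Wx equals the dot product of the column sums of W with x.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Plain matrix-vector product y = W x (stands in for a linear layer)."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def column_checksum(w: Matrix) -> Vector:
    """Checksum row c with c[j] = sum of column j of W; can be computed once per layer."""
    return [sum(col) for col in zip(*w)]


def abft_check(y: Vector, checksum: Vector, x: Vector, tol: float = 1e-6) -> bool:
    """Return True when sum(y) agrees with checksum . x within tol (no error detected)."""
    expected = sum(cj * xj for cj, xj in zip(checksum, x))
    return abs(sum(y) - expected) <= tol


# Hypothetical weights and input, chosen only for illustration.
w = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x = [0.5, -1.0]

checksum = column_checksum(w)
y = matvec(w, x)

# Simulate a computation error (e.g. from undervolting) by corrupting one output.
y_faulty = list(y)
y_faulty[1] += 0.25

print("clean result passes check: ", abft_check(y, checksum, x))
print("faulty result passes check:", abft_check(y_faulty, checksum, x))
```

The check costs one extra dot product per layer invocation, which is why this style of detection can have low overhead. A single checksum cannot detect errors that happen to cancel in the sum, and the tolerance has to absorb normal floating-point rounding, so both would need tuning in a real deployment.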

Paper Structure

This paper contains 11 sections, 4 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: Simplified concept of a Timing-Error-Detection circuit [safarpour2021algorithm].
  • Figure 2: ABFT scheme for vector-matrix multiplications.
  • Figure 3: ABFT scheme for convolutions.
  • Figure 4: Power consumption of VGG-16 with ABFT enabled and disabled, at 1780 MHz and 1680 MHz clock frequencies, as a function of voltage. The PoFF and crash points are marked for each.
  • Figure 5: The voltage is reduced from the default level down to the crash point. ABFT detects errors in the computations. Note that no accuracy loss is observed despite the ABFT detections, owing to the inherent fault tolerance of DNNs.