GAVINA: flexible aggressive undervolting for bit-serial mixed-precision DNN acceleration

Jordi Fornt; Pau Fontova-Musté; Adrian Gras; Omar Lahyani; Martí Caro; Jaume Abella; Francesc Moll; Josep Altet

TL;DR

The paper tackles the energy-efficiency challenge in DNN hardware by combining Guarded Aggressive underVolting (GAV) with bit-serial, mixed-precision computation to enable flexible, low-power operation. It introduces the GAVINA accelerator, detailing its architecture, a two-domain voltage strategy, and a loss-aware error model that supports per-layer voltage optimization. Using a physical 12 nm design, an undervolting model calibrated against gate-level simulation (GLS) data, and CIFAR-10/ResNet-18 benchmarks, the work reports an energy efficiency of up to 89 TOP/sW in its most aggressive configuration and an energy efficiency boost of 20% from undervolting with negligible accuracy loss. It also compares GAVINA favorably against state-of-the-art accelerators, highlighting the benefits of combining mixed precision, bit-serial computation, and controlled undervolting for energy-efficient DNN inference.

Abstract

Voltage overscaling, or undervolting, is an enticing approximate technique in the context of energy-efficient Deep Neural Network (DNN) acceleration, given the quadratic relationship between power and voltage. Nevertheless, its very high error rate has thwarted its general adoption. Moreover, recent undervolting accelerators rely on 8-bit arithmetic and cannot compete with state-of-the-art low-precision (<8b) architectures. To overcome these issues, we propose a new technique called Guarded Aggressive underVolting (GAV), which combines the ideas of undervolting and bit-serial computation to create a flexible approximation method based on aggressively lowering the supply voltage on a select number of least significant bit combinations. Based on this idea, we implement GAVINA (GAV mIxed-precisioN Accelerator), a novel architecture that supports arbitrary mixed precision and flexible undervolting, with an energy efficiency of up to 89 TOP/sW in its most aggressive configuration. By developing an error model of GAVINA, we show that GAV can achieve an energy efficiency boost of 20% via undervolting, with negligible accuracy degradation on ResNet-18.
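
To make the idea in the abstract more concrete, the following is a minimal Python sketch of a bit-serial dot product in which the least significant bit combinations are flagged as candidates for undervolting. It is an illustration under stated assumptions, not the GAVINA implementation: the function names, the `guard_threshold` parameter, the rule "significance i + j below a threshold", and the restriction to unsigned operands are all hypothetical. The helper `relative_dynamic_power` only encodes the quadratic power-voltage relationship mentioned in the abstract.

```python
from itertools import product


def bit_planes(values, bits):
    """Split unsigned integers into `bits` binary planes, LSB first."""
    return [[(v >> b) & 1 for v in values] for b in range(bits)]


def bit_serial_dot(acts, weights, a_bits, w_bits, guard_threshold):
    """Unsigned bit-serial dot product (illustrative sketch).

    Returns (result, undervolted). `undervolted` lists the (i, j) bit
    combinations whose significance i + j is below `guard_threshold`.
    This selection rule is an assumption made for illustration only.
    """
    a_planes = bit_planes(acts, a_bits)
    w_planes = bit_planes(weights, w_bits)
    total = 0
    undervolted = []
    for i, j in product(range(a_bits), range(w_bits)):
        # A 1-bit x 1-bit product is an AND; summing the ANDs is a popcount.
        partial = sum(a & w for a, w in zip(a_planes[i], w_planes[j]))
        # Each partial sum is weighted by the significance of its bit pair.
        total += partial << (i + j)
        if i + j < guard_threshold:
            # Low-significance combination: errors here perturb the result least.
            undervolted.append((i, j))
    return total, undervolted


def relative_dynamic_power(v_scaled, v_nominal):
    """Dynamic power relative to nominal, assuming P is proportional to V^2."""
    return (v_scaled / v_nominal) ** 2


# 3-bit activations and 2-bit weights (an arbitrary mixed-precision example).
result, low_combos = bit_serial_dot([3, 5, 7, 2], [1, 2, 3, 1], 3, 2, 2)
print(result)      # 36, identical to the ordinary dot product
print(low_combos)  # [(0, 0), (0, 1), (1, 0)]
```

Because each bit combination is processed as a separate step, the operand precisions can be chosen freely and the supply voltage can in principle differ from step to step. In this sketch an error in combination (i, j) would be scaled by 2^(i+j), which is why only the low-significance combinations are marked.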
