Bespoke Co-processor for Energy-Efficient Health Monitoring on RISC-V-based Flexible Wearables

Theofanis Vergos, Polykarpos Vergos, Mehdi B. Tahoori, Georgios Zervakis

TL;DR

Flexible wearables demand energy-efficient ML under tight area and power budgets. The authors integrate a bespoke MAC co-processor with a compact RISC-V SERV core and use an automated constraint-programming flow to fix the co-processor's coefficients and map MLP inference onto it, achieving near-real-time performance. Their CP-SAT-driven optimization jointly selects constants and decomposition strategies to minimize latency $L$ under a multiplier budget, and the co-processor Verilog is generated automatically. Post-layout results across healthcare datasets show latency $<1$ s, energy per inference of $0.48$ mJ, and area of $2.42$ mm$^2$, with an average $2.35\times$ speedup and $2.15\times$ lower energy consumption than the state of the art, supporting the approach for conformable wearables.

Abstract

Flexible electronics offer unique advantages for conformable, lightweight, and disposable healthcare wearables. However, their limited gate count, large feature sizes, and high static power consumption make on-body machine learning classification highly challenging. While existing bendable RISC-V systems provide compact solutions, they lack the energy efficiency required. We present a mechanically flexible RISC-V that integrates a bespoke multiply-accumulate co-processor with fixed coefficients to maximize energy efficiency and minimize latency. Our approach formulates a constrained programming problem to jointly determine co-processor constants and optimally map Multi-Layer Perceptron (MLP) inference operations, enabling compact, model-specific hardware by leveraging the low fabrication and non-recurring engineering costs of flexible technologies. Post-layout results demonstrate near-real-time performance across several healthcare datasets, with our circuits operating within the power budget of existing flexible batteries and occupying only 2.42 mm^2, offering a promising path toward accessible, sustainable, and conformable healthcare wearables. Our microprocessors achieve an average 2.35x speedup and 2.15x lower energy consumption compared to the state of the art.
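
The idea of choosing co-processor constants under a multiplier budget can be illustrated with a toy sketch. This is not the authors' CP-SAT formulation: the cost model (`FAST_CYCLES`, `SLOW_CYCLES`), the function names, and the example weights are all assumptions made for illustration, and exhaustive search stands in for the constraint solver.

```python
from collections import Counter
from itertools import combinations

# Hypothetical cycle costs, chosen only for illustration.
FAST_CYCLES = 1   # MAC served by a hardwired (bespoke) constant multiplier
SLOW_CYCLES = 4   # MAC whose weight is not hardwired and takes a slower path


def latency(weights, constants):
    """Toy cycle count: zero weights are skipped, the rest are fast or slow."""
    hardwired = set(constants)
    return sum(
        FAST_CYCLES if w in hardwired else SLOW_CYCLES
        for w in weights
        if w != 0
    )


def select_constants(weights, budget):
    """Exhaustively pick at most `budget` constants minimizing the toy latency."""
    candidates = sorted(set(w for w in weights if w != 0))
    k = min(budget, len(candidates))
    best = min(combinations(candidates, k), key=lambda cs: latency(weights, cs))
    return best, latency(weights, best)


# Made-up 4-bit signed weights in [-8, 7].
weights = [3, 3, -2, 3, 0, 7, -2, 1, 3, -2]
constants, cycles = select_constants(weights, budget=2)
print(constants, cycles)  # (-2, 3) 15
```

Under this simplified cost model the most frequent nonzero weights always win, so the search is trivial; the paper's formulation additionally accounts for decomposition strategies and the mapping of MLP operations, which this sketch does not model.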

Paper Structure

This paper contains 13 sections, 1 equation, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: Our flexible RISC-V-based systems comprising a SERV core and a bespoke co-processor.
  • Figure 2: Area of signed bespoke multipliers in FlexIC technology [flexic_gen3] for varying constants. For reference, conventional $4\times4$-bit and $8\times4$-bit multipliers occupy $0.020$ mm$^2$ and $0.038$ mm$^2$, respectively. Post-synthesis results are reported.
  • Figure 3: Monte Carlo analysis of (a) area and (b) power of our bespoke co-processor as a function of the number of bespoke multipliers. Results are based on $200$ post-synthesis samples with $4$-bit inputs and random constants in $[-8, 7]$. The red line refers to a co-processor with eight conventional $4\times4$ multipliers.
  • Figure 4: Cycle-level example of neuron output accumulation.
  • Figure 5: Layout of (a) SERV (non-accelerated), (b) Flex-RV [ozer:nature2024:bendableRiscV], (c) Semi-bespoke with eight $4\times4$ multipliers, and ours: (d) AffectiveRoad, (e) Arrhythmia, (f) Dermatology, (g) DriveDB, (h) ECG5000, (i) HAR, (j) SPD, (k) StressInNurses, and (l) WESAD. Red regions highlight the co-processor.