Optimizing Energy Efficiency in Subthreshold RISC-V Cores

Asbjørn Djupdal, Magnus Själander, Magnus Jahre, Snorre Aunet, Trond Ytterdal

TL;DR

Subthreshold operation offers very low energy consumption for microcontrollers, but designing energy-efficient, standard-ISA cores requires careful microarchitectural choices. The authors create a dedicated subthreshold library for a 130 nm process and compare seven configurations of six open-source RISC-V cores (SERV, QERV, PicoRV32, Ibex, Rocket, and two Vex variants), spanning multi-cycle and pipelined implementations at 300 mV. A two-stage pipeline (Vex-2) delivers the best overall energy efficiency because it combines a low CPI with less buffering overhead than deeper pipelines. Deeper pipelines incur significant area and power penalties without commensurate runtime gains, while multi-cycle cores are Pareto-optimal only under tight power or area constraints. The work demonstrates that a limited pipeline depth is a practical design principle for energy-efficient standard-ISA microcontrollers in subthreshold operation and provides a reproducible evaluation workflow. This has implications for ultra-low-power IoT and biomedical devices that rely on energy harvesting and tight energy budgets.

Abstract

Our goal in this paper is to understand how to maximize energy efficiency when designing standard-ISA processor cores for subthreshold operation. We therefore develop a custom subthreshold library and use it to synthesize the open-source RISC-V cores SERV, QERV, PicoRV32, Ibex, Rocket, and two variants of Vex, targeting a supply voltage of 300 mV in a commercial 130 nm process. SERV, QERV, and PicoRV32 are multi-cycle architectures, while Ibex, Vex, and Rocket are pipelined architectures. We find that SERV, QERV, PicoRV32, and Vex are Pareto optimal in one or more of performance, power, and area. The 2-stage Vex (Vex-2) is the most energy-efficient core overall, mainly because it uses fewer cycles per instruction than multi-cycle SERV, QERV, and PicoRV32 while retaining similar power consumption. Pipelining increases core area, and we observe that for subthreshold operation, the longer wires of pipelined designs require adding buffers to maintain a cycle time that is low enough to achieve high energy efficiency. These buffers limit the performance gains achievable by deeper pipelining because they result in cycle time no longer scaling proportionally with pipeline stages. The added buffers and the additional area required for pipelining logic, however, increase power consumption, and Vex-2 therefore provides similar performance to, and lower power consumption than, the 5-stage cores Vex-5 and Rocket. A key contribution of this paper is therefore to demonstrate that limited-depth pipelined RISC-V designs hit the sweet spot in balancing performance and power consumption when optimizing for energy efficiency in subthreshold operation.
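
The abstract's argument rests on how energy per instruction (EPI) combines power, cycle time, and cycles per instruction. The minimal sketch below illustrates that relationship under the assumption of constant average power; the function name and all numeric values are hypothetical placeholders chosen for illustration and are not measurements from the paper.

```python
def energy_per_instruction(power_w, clock_hz, cpi):
    """Return EPI in joules: average power times the time one instruction takes.

    Assumes constant average power over the run (a simplification).
    """
    seconds_per_instruction = cpi / clock_hz
    return power_w * seconds_per_instruction


# Hypothetical placeholder values, NOT results from the paper.
# Two cores with equal power and clock frequency but different CPI:
multi_cycle = energy_per_instruction(power_w=1e-6, clock_hz=100e3, cpi=32.0)
two_stage = energy_per_instruction(power_w=1e-6, clock_hz=100e3, cpi=2.0)

print(f"multi-cycle EPI: {multi_cycle:.3e} J")
print(f"two-stage EPI:   {two_stage:.3e} J")
```

With power and frequency held equal, EPI scales linearly with CPI, which is why a pipelined core that keeps power close to that of a multi-cycle core can come out ahead on energy.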

Paper Structure

This paper contains 11 sections, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Imbalance $i$ versus voltage across temperatures.
  • Figure 2: Ultra-low-power processor microarchitecture design options. SERV, QERV, and PicoRV32 are multi-cycle architectures (a), Ibex and Vex-2 are 2-stage pipelined architectures (b), and Vex-5 and Rocket are 5-stage pipelined architectures (c).
  • Figure 3: Power consumption, execution time, and Energy Per Instruction (EPI) across benchmarks and cores.
  • Figure 4: Pareto-optimal designs when optimizing for energy efficiency (EPI) and runtime (a), power (b), or area (c).
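
Figure 4 selects Pareto-optimal designs over pairs of metrics. As a rough sketch of what such a selection involves, the code below keeps every design that no other design beats or matches on all metrics while strictly beating it on at least one. It assumes lower is better for every metric; the design names and values are hypothetical placeholders, not data from the paper.

```python
def pareto_front(designs):
    """Return names of designs that no other design dominates.

    `designs` maps a name to a tuple of metrics where lower is better.
    A design is dominated if another is no worse in every metric and
    strictly better in at least one.
    """
    front = []
    for name, metrics in designs.items():
        dominated = any(
            all(o <= m for o, m in zip(other, metrics))
            and any(o < m for o, m in zip(other, metrics))
            for other_name, other in designs.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical (EPI, runtime) pairs in arbitrary units, NOT from the paper.
designs = {"A": (1.0, 5.0), "B": (2.0, 2.0), "C": (3.0, 3.0), "D": (4.0, 1.0)}
print(pareto_front(designs))  # ['A', 'B', 'D']
```

Design C drops out because B is better on both metrics; the remaining three each win on at least one axis, so none can be discarded without a trade-off.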