A Bespoke Design Approach to Low-Power Printed Microprocessors for Machine Learning Applications

Panagiotis Chaidos, Giorgos Armeniakos, Sotirios Xydis, Dimitrios Soudris

TL;DR

This work tackles the challenge of deploying ML on ultra-low-power printed electronics by proposing a generic bespoke design workflow that removes unused logic and introduces a multi-precision SIMD MAC unit. The authors demonstrate the approach on two low-power cores (Zero-Riscy and TP-ISA) and report substantial area, power, and speed improvements with controlled accuracy loss. A key contribution is the end-to-end workflow that guides hardware reduction, MAC integration, and RTL verification, validated across multiple ML models. The results indicate significant Pareto-optimal trade-offs that make battery-powered printed ML accelerators more viable, while also revealing scenario-dependent accuracy costs when competing against state-of-the-art printed processors.

Abstract

Printed electronics have gained significant traction in recent years, presenting a viable path to integrating computing into everyday items, from disposable products to low-cost healthcare. However, the adoption of computing in these domains is hindered by strict area and power constraints, limiting the effectiveness of general-purpose microprocessors. This paper proposes a bespoke microprocessor design approach to address these challenges by tailoring the design to specific applications and eliminating unnecessary logic. Targeting machine learning applications, we further optimize core operations by integrating a SIMD MAC unit supporting 4 precision configurations that boost the efficiency of microprocessors. Our evaluation across 6 ML models and the large-scale Zero-Riscy core shows that our methodology can achieve improvements of 22.2%, 23.6%, and 33.79% in area, power, and speed, respectively, without compromising accuracy. Against state-of-the-art printed processors, our approach can still offer significant speedups, albeit with some accuracy degradation. This work explores how such trade-offs can enable low-power printed microprocessors for diverse ML applications.

Paper Structure

This paper contains 10 sections, 1 equation, 5 figures, 2 tables.

Figures (5)

  • Figure 1: (a) Baseline area, power, and system clock for Zero-Riscy and TP-ISA in EGFET printed technology. (b) Percentage of total area and power consumption for the main functional units of Zero-Riscy: EX (Execution Unit), MUL (Multiplier), RF (Register File), and IF / ID / Ctl (Instruction Fetch, Instruction Decode, and Controller units, grouped together).
  • Figure 2: Overview of the proposed MAC unit. The unit has been implemented for precision options with $n$ = 32, 16, 8, and 4 bits. For each option, the unit can be split into 1, 2, 4, and 8 concurrent operations, respectively.
  • Figure 3: Diagram of the proposed methodology for bespoke ML microprocessors.
  • Figure 4: Average accuracy loss per model introduced by each precision option.
  • Figure 5: Scatterplot of TP-ISA configurations, where d is the datapath width in bits, m signifies that the proposed MAC unit is implemented (with d bits), and p is the precision of the unit (the absence of p means that the unit uses the core's standard precision, so there is no parallelization). The Pareto front for area and speedup is highlighted in blue.
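
The lane splitting described in the Figure 2 caption can be illustrated with a small behavioral model. This is a minimal sketch under stated assumptions, not the authors' RTL: the function names (`split_lanes`, `pack_lanes`, `simd_mac`) are hypothetical, operands are assumed to be signed two's-complement values packed least-significant lane first, and the lane products are assumed to be summed into a single accumulator in dot-product style. The source only states that a 32-bit unit with precision $n$ performs 32/$n$ concurrent operations.

```python
def split_lanes(word, n, width=32):
    """Split a `width`-bit word into width // n signed n-bit lanes (LSB lane first)."""
    assert width % n == 0, "precision must divide the datapath width"
    mask = (1 << n) - 1
    lanes = []
    for i in range(width // n):
        raw = (word >> (i * n)) & mask
        # Assumption: lanes hold two's-complement signed values.
        if raw >= 1 << (n - 1):
            raw -= 1 << n
        lanes.append(raw)
    return lanes


def pack_lanes(values, n):
    """Pack signed integers into one word, n bits per lane (LSB lane first)."""
    mask = (1 << n) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * n)
    return word


def simd_mac(acc, a_word, b_word, n, width=32):
    """Return acc plus the sum of lane-wise products of two packed words.

    Assumption: all lane products feed one accumulator; the real unit
    may instead keep per-lane accumulators.
    """
    a = split_lanes(a_word, n, width)
    b = split_lanes(b_word, n, width)
    return acc + sum(x * y for x, y in zip(a, b))


# Usage: four 8-bit multiply-accumulates in one call.
a = pack_lanes([1, -2, 3, -4], 8)
b = pack_lanes([5, 6, -7, 8], 8)
result = simd_mac(10, a, b, 8)  # 10 + (5 - 12 - 21 - 32)
```

With `width=32`, the lane counts for $n$ = 32, 16, 8, and 4 come out as 1, 2, 4, and 8, matching the caption. Lower precision buys more operations per call, which is the source of the speedup and also of the accuracy loss the paper measures.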