MLPerf Tiny Benchmark

Colby Banbury, Vijay Janapa Reddi, Peter Torelli, Jeremy Holleman, Nat Jeffries, Csaba Kiraly, Pietro Montino, David Kanter, Sebastian Ahmed, Danilo Pau, Urmish Thakker, Antonio Torrini, Peter Warden, Jay Cordaro, Giuseppe Di Guglielmo, Javier Duarte, Stephen Gibellini, Videet Parekh, Honson Tran, Nhan Tran, Niu Wenxu, Xu Xuesong

TL;DR

MLPerf Tiny tackles the lack of a reproducible, energy-aware benchmark for ultra-low-power TinyML by introducing an open, modular suite with four inference tasks. It prescribes datasets, models, and quality targets, plus reference implementations and a two-division submission scheme (closed and open) to balance fair comparison with innovation. The framework enables standardized measurements of accuracy, latency, and energy, demonstrated by the first round of community submissions that reveal trends in 8-bit quantization, software/hardware diversity, and FPGA-enabled variability. Collectively, MLPerf Tiny aims to harmonize TinyML evaluation, spur progress across hardware and software stacks, and guide responsible deployment in resource-constrained environments.

Abstract

Advancements in ultra-low-power tiny machine learning (TinyML) systems promise to unlock an entirely new class of smart applications. However, continued progress is limited by the lack of a widely accepted and easily reproducible benchmark for these systems. To meet this need, we present MLPerf Tiny, the first industry-standard benchmark suite for ultra-low-power tiny machine learning systems. The benchmark suite is the collaborative effort of more than 50 organizations from industry and academia and reflects the needs of the community. MLPerf Tiny measures the accuracy, latency, and energy of machine learning inference to properly evaluate the tradeoffs between systems. Additionally, MLPerf Tiny implements a modular design that enables benchmark submitters to show the benefits of their product, regardless of where it falls on the ML deployment stack, in a fair and reproducible manner. The suite features four benchmarks: keyword spotting, visual wake words, image classification, and anomaly detection.

Paper Structure

This paper contains 22 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Summary of the Tiny Machine Learning Stack. There is diversity at every level, which makes standardization for benchmarking challenging.
  • Figure 2: The modular design of MLPerf Tiny enables both the direct comparison of solutions and the demonstration of an improvement over the reference. The reference implementations are fully implemented solutions that allow individual components to be swapped out. The components in green can be modified in either division, and the orange components can only be modified in the open division. The reference implementations also act as the baseline for the results.
  • Figure 3: The two configuration modes of the benchmark framework for (a.) latency and accuracy measurement, or (b.) energy measurement.
  • Figure 4: The graphical user interface (GUI) for the benchmark runner.
  • Figure 5: The energy and latency results of the reference implementations. Each reference implementation was run on the NUCLEO-L4R5ZI board, which is shown in the top right.
  • ...and 1 more figures