AxOCS: Scaling FPGA-based Approximate Operators using Configuration Supersampling

Siva Satyendra Sahoo; Salim Ullah; Soumyo Bhattacharjee; Akash Kumar

TL;DR

The paper addresses the challenge of designing low-cost FPGA-based approximate arithmetic for embedded ML by exploiting correlations across bit-widths. It introduces AxOCS, an ML-based supersampling framework that uses characterization data from smaller bit-width operators ($L_{CHAR}$) to generate diverse configurations for larger bit-width operators ($H_{CHAR}$), guided by distance-based matching and AutoML-based estimators for behavioral accuracy (BEHAV) and power, performance, and area (PPA). A constrained multi-objective design space exploration (DSE) is then performed with an augmented genetic algorithm (GA) that takes the supersampled solutions as seed input to increase the Pareto hypervolume, yielding substantial improvements over state-of-the-art approaches for 8×8 signed multipliers. The approach demonstrates scalable search acceleration and improved BEHAV/PPA trade-offs on FPGA targets, and it offers a pathway to application-specific operator design. Future directions include alternative distance measures and sequence-to-sequence supersampling.
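As a rough illustration of distance-based matching, the sketch below pairs each low-bit-width design with the nearest high-bit-width design in a normalized (PPA, BEHAV) metric space. The function name `nearest_match`, the Euclidean distance, the matching direction, and the toy metric values are all assumptions made for illustration; the summary does not specify how the paper defines its matching.

```python
import math


def nearest_match(low_metrics, high_metrics):
    """For each low-bit-width design, return the index of the closest
    high-bit-width design in metric space (Euclidean distance, assumed)."""
    matches = []
    for lm in low_metrics:
        best = min(
            range(len(high_metrics)),
            key=lambda j: math.dist(lm, high_metrics[j]),
        )
        matches.append(best)
    return matches


# Toy, normalized (PPA, BEHAV) pairs; hypothetical values, not data from the paper.
low = [(0.1, 0.9), (0.8, 0.2)]
high = [(0.75, 0.25), (0.15, 0.85), (0.5, 0.5)]
pairs = nearest_match(low, high)
```

Each resulting (low, high) index pair could then serve as a training example for a model that maps small-operator configurations to large-operator ones, which is one plausible reading of the supersampling step.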

Abstract

The rising usage of AI and ML-based processing across application domains has exacerbated the need for low-cost ML implementation, specifically for resource-constrained embedded systems. To this end, approximate computing, an approach that explores the power, performance, area (PPA), and behavioral accuracy (BEHAV) trade-offs, has emerged as a possible solution for implementing embedded machine learning. Due to the predominance of MAC operations in ML, designing platform-specific approximate arithmetic operators forms one of the major research problems in approximate computing. Recently, there has been a rising usage of AI/ML-based design space exploration techniques for implementing approximate operators. However, most of these approaches are limited to using ML-based surrogate functions for predicting the PPA and BEHAV impact of a set of related design decisions. While this approach leverages the regression capabilities of ML methods, it does not exploit the more advanced approaches in ML. To this end, we propose AxOCS, a methodology for designing approximate arithmetic operators through ML-based supersampling. Specifically, we present a method to leverage the correlation of PPA and BEHAV metrics across operators of varying bit-widths for generating larger bit-width operators. The proposed approach involves traversing the relatively smaller design space of smaller bit-width operators and employing its associated Design-PPA-BEHAV relationship to generate initial solutions for metaheuristics-based optimization for larger operators. The experimental evaluation of AxOCS for FPGA-optimized approximate operators shows that the proposed approach significantly improves the quality (the resulting hypervolume for multi-objective optimization) of 8x8 signed approximate multipliers.
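The hypervolume used as the quality measure above can be illustrated for two minimized objectives (for example, a PPA cost and an error metric). The following is a minimal sketch, assuming both objectives are minimized and a user-chosen reference point; the function names and the toy design points are hypothetical and are not taken from the paper.

```python
def pareto_front(points):
    """Return the non-dominated points (both objectives minimized),
    sorted by increasing first objective."""
    front = []
    best_y = float("inf")
    for x, y in sorted(set(points)):
        if y < best_y:  # strictly better second objective than all earlier points
            front.append((x, y))
            best_y = y
    return front


def hypervolume_2d(points, ref):
    """Area dominated by the Pareto front and bounded by reference point `ref`."""
    rx, ry = ref
    front = [(x, y) for x, y in pareto_front(points) if x < rx and y < ry]
    area = 0.0
    prev_y = ry
    for x, y in front:
        # Each front point adds a horizontal slab between its y and the previous y.
        area += (rx - x) * (prev_y - y)
        prev_y = y
    return area


# Toy (cost, error) pairs; hypothetical values, not results from the paper.
baseline = [(1, 3), (2, 2), (3, 1), (3, 3)]
seeds = [(1.5, 1.5)]
hv_baseline = hypervolume_2d(baseline, ref=(4, 4))
hv_seeded = hypervolume_2d(baseline + seeds, ref=(4, 4))
```

Adding candidate solutions to a set can only keep or increase its hypervolume, which is why well-placed seed solutions for a metaheuristic search can raise this metric.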

Paper Structure (20 sections, 3 equations, 18 figures, 2 tables)

Introduction
Background and Related Works
Approximate Computing
DSE for FPGA-based Approximate Operators
Operator Model
AxOCS Methodology
Statistical Analysis
Estimator Design
Similarity Analysis
Distance-based matching
Multiobjective DSE
conss
Augmented Metaheuristics-based DSE
Experiments and Results
Experiment Setup
...and 5 more sections

Figures (18)

Figure 1: k-means clustering of design points representing approximate implementations of 8-bit and 12-bit unsigned adders
Figure 2: Variation of scaled PDPLUT and AVG_ABS_REL_ERR with UINT-encoded configuration for 8-bit and 12-bit unsigned approximate adders. The ordered (based on UINT configuration) sequence of metrics for 12-bit designs is sub-sampled to get a similar-length sequence for both operators.
Figure 3: Approximating a 3-bit unsigned adder's FPGA implementation using selective LUT removal [ullah2022appaxo]
Figure 4: AxOCS methodology
Figure 5: Configuration-PPA/BEHAV trends for unsigned adder AxOs
...and 13 more figures
