On the Systematic Creation of Faithfully Rounded Commutative Truncated Booth Multipliers

Theo Drane; Samuel Coward; Mertcan Temel; Joe Leslie-Hurd

On the Systematic Creation of Faithfully Rounded Commutative Truncated Booth Multipliers

Theo Drane, Samuel Coward, Mertcan Temel, Joe Leslie-Hurd

TL;DR

The paper addresses faithful rounding in fixed-point multiplication and the non-commutativity introduced by truncating Booth arrays. It proposes a commutative truncated Booth multiplier that inserts a small set of compensation bits and truncates the double Booth-encoded sum, backed by tight bounds on the truncation error $\Delta$ guaranteeing faithful rounding via $C-2^n < \Delta < C+2^k\max(\Delta)-2^k$ with $C$ a multiple of $2^k$. It derives the optimal truncation depth $k^*$ and corresponding constant $C^*$, and provides a step-by-step hardware construction and synthesis results showing up to 31% area and 38% power savings over a baseline, and up to 13% area and 12% power savings over a truncated-AND design. ACL2-based formal verification confirms both faithfulness and commutativity up to $n=42$, though verification for larger widths scales poorly. These results enable smaller, lower-power, faithfully rounded multipliers and pave the way for broader Booth-based truncation schemes.

Abstract

In many instances of fixed-point multiplication, a full precision result is not required. Instead it is sufficient to return a faithfully rounded result. Faithful rounding permits the machine representable number either immediately above or below the full precision result, if the latter is not exactly representable. Multipliers which take full advantage of this freedom can be implemented using less circuit area and consuming less power. The most common implementations internally truncate the partial product array. However, truncation applied to the most common of multiplier architectures, namely Booth architectures, results in non-commutative implementations. The industrial adoption of truncated multipliers is limited by the absence of formal verification of such implementations, since exhaustive simulation is typically infeasible. We present a commutative truncated Booth multiplier architecture and derive closed form necessary and sufficient conditions for faithful rounding. We also provide the bit-vectors giving rise to the worst-case error. We present a formal verification methodology based on ACL2 which scales up to 42 bit multipliers. We synthesize a range of commutative faithfully rounded multipliers and show that truncated booth implementations are up to 31% smaller than externally truncated multipliers.

On the Systematic Creation of Faithfully Rounded Commutative Truncated Booth Multipliers

TL;DR

guaranteeing faithful rounding via

with

a multiple of

. It derives the optimal truncation depth

and corresponding constant

, and provides a step-by-step hardware construction and synthesis results showing up to 31% area and 38% power savings over a baseline, and up to 13% area and 12% power savings over a truncated-AND design. ACL2-based formal verification confirms both faithfulness and commutativity up to

, though verification for larger widths scales poorly. These results enable smaller, lower-power, faithfully rounded multipliers and pave the way for broader Booth-based truncation schemes.

Abstract

Paper Structure (12 sections, 1 theorem, 17 equations, 3 figures, 5 tables)

This paper contains 12 sections, 1 theorem, 17 equations, 3 figures, 5 tables.

Introduction
Background
Binary Multiplication
Faithfully Rounded Binary Multiplication
Methodology
Commutative Truncated Booth Arrays
The Truncation Error
Faithfully Rounded Commutative Truncated Booth Arrays
Results
Synthesis
Formal Verification
Conclusion

Key Result

Lemma 1

Extremal values of $\Delta$ occur when $a_{i}\neq a_{i-2}$ and $b_{j}\neq b_{j-2}$ for all $n>i,j>1$.

Figures (3)

Figure 1: A 16-bit $a\times b$, implemented as a traditional partial product array, where each partial product bit is $a[i]\& b[j]$. From this array we truncate 12 columns and insert the compensation constant, 11 (red). Summing the truncated array (black) and discarding the light blue bits produces a faithfully rounded 16-bit multiplication result.
Figure 2: A 16-bit commutative truncated Booth multiplier, with 12 columns of truncation. The six red bits are the additional compensation bits $s_i'$. The pink bits represent the typical Booth sign bits $s_i$.
Figure 3: Area-delay profiles for the three competing commutative designs for n=64. We also plot the truncated Booth architecture without the compensation bits to recover commutativity.

Theorems & Definitions (2)

Lemma 1
proof

On the Systematic Creation of Faithfully Rounded Commutative Truncated Booth Multipliers

TL;DR

Abstract

On the Systematic Creation of Faithfully Rounded Commutative Truncated Booth Multipliers

Authors

TL;DR

Abstract

Table of Contents

Key Result

Figures (3)

Theorems & Definitions (2)