Design and Implementation of a Takum Arithmetic Hardware Codec

Laslo Hunhold

TL;DR

The paper addresses limitations of existing floating-point and posit formats by presenting a hardware codec for both takums (a logarithmic number system, LNS) and linear takums (a floating-point format), built on a novel internal LNS representation. It provides an open-source VHDL implementation targeting FPGAs and describes a decoder/encoder architecture that exploits the bounded exponent and a compact preprocessing stage. On a Kintex UltraScale+ FPGA, the takum decoders achieve up to 38% lower latency and up to 50% lower LUT utilisation than state-of-the-art posit codecs, while the encoders reach up to 13% lower latency at similar resource use. The work suggests that takums offer practical benefits for mixed-precision numerical computing; stated directions for future work are VLSI and full-APU integration, quire considerations, and further exploration of the base $\sqrt{e}$ chosen for the logarithmic form.

Abstract

The takum machine number format has been recently proposed as an enhancement over the posit number format, which is considered a promising alternative to the IEEE 754 floating-point standard. Takums retain the useful posit properties, but feature a novel exponent coding scheme that yields more precision for small and large magnitude numbers and a much higher and bounded dynamic range. This paper presents the design and implementation of a hardware codec for both takums (logarithmic number system, LNS) and linear takums (floating-point format). The codec design is emphasised, as it constitutes the primary distinguishing feature compared to logarithmic posits (LNS) and posits (floating-point format), which otherwise share similar internal representations. Furthermore, a novel internal representation for LNS is proposed. The presented takum codec, implemented in VHDL, demonstrates near-optimal scalability and performance on an FPGA. It achieves latency reductions of up to 38% and reduces LUT utilisation up to 50% compared to the best state-of-the-art posit codecs.
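
To make the "bounded dynamic range" of the exponent coding concrete, the following is a minimal software sketch of decoding a logarithmic takum bit string to a float. It is not the paper's VHDL codec. The field layout (1 sign bit, 1 direction bit, 3 regime bits, then $r$ characteristic bits and $p = n - r - 5$ mantissa bits) and the value formula $(-1)^S \sqrt{e}^{\,\ell}$ are assumptions taken from the published takum definition (cited here as 2024-takum), which this page does not reproduce, so they should be checked against Definition 1 of the paper. The function name `decode_takum` is hypothetical.

```python
import math


def decode_takum(bits: int, n: int) -> float:
    """Decode an n-bit logarithmic takum (given as an unsigned int) to a float.

    Assumed layout, MSB to LSB: S (1 bit), D (1 bit), R (3 bits),
    C (r bits), M (p = n - r - 5 bits). Software model only.
    """
    if n < 2:
        raise ValueError("takums are assumed to need at least 2 bits")
    bits &= (1 << n) - 1

    # Special cases: all zeros is 0, sign bit alone is NaR (modelled as NaN).
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return math.nan

    # Zero-extend short takums on the right so all header fields exist.
    width = max(n, 12)
    t = bits << (width - n)

    sign = (t >> (width - 1)) & 1
    direction = (t >> (width - 2)) & 1
    regime = (t >> (width - 5)) & 0b111

    # Regime value r in 0..7: number of characteristic bits.
    r = regime if direction else 7 - regime
    p = width - 5 - r  # number of mantissa bits

    char_bits = (t >> p) & ((1 << r) - 1)
    mant_bits = t & ((1 << p) - 1)

    # Characteristic c lies in -255..254, which bounds the dynamic range.
    if direction:
        c = (1 << r) - 1 + char_bits
    else:
        c = -(1 << (r + 1)) + 1 + char_bits

    m = mant_bits / (1 << p)             # mantissa in [0, 1)
    ell = (-1) ** sign * (c + m)         # logarithmic value, |ell| < 255
    return (-1) ** sign * math.exp(ell / 2)  # sqrt(e) ** ell
```

Under these assumptions, `decode_takum(0x400, 12)` evaluates to `1.0`, and the two's complement of a bit string decodes to the negated value, which is one of the posit-like properties the abstract says takums retain.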

Paper Structure

This paper contains 17 sections, 3 theorems, 16 equations, 4 figures, 2 tables.

Key Result

Proposition 1 (characteristic complement)

Let $n \in \mathbb{N}_1$ and bit string $T := (S, D, R, C, M) \in \{0,1\}^n$ as in Definition 1 with $\tau((S, D,$ …

Figures (4)

  • Figure 1: The logic circuit of the predecoder, largely separated into three main entities: the regime/antiregime determinator (E1), the characteristic/exponent determinator (E2) and the special case detector (E3). We assume $n \ge 12$ (thus omitting optional zero-expansion of $\mathit{takum}$ at the beginning for $n < 12$) for simplicity; the implemented predecoder works for any $n \ge 2$. We also assume an enabled $\mathit{output\_exponent}$, as disabling it would only flip the top MUX in E2. Vertical dashed lines indicate where the strands of a multi-signal are split up or combined.
  • Figure 2: The logic circuit of the postencoder, largely separated into five main entities: the underflow/overflow predictor (E1), the characteristic precursor determinator (E2), the extended takum generator (E3), the rounder (E4) and the output driver (E5). We assume $n \ge 12$ (thus omitting special case handling in the underflow/overflow predictor for $n < 12$) for simplicity; the implemented postencoder works for any $n \ge 2$. Vertical dashed lines indicate where the strands of a multi-signal are split up or combined.
  • Figure 3: Evaluation results for the decoder in terms of latency and LUT consumption.
  • Figure 4: Evaluation results for the encoder in terms of latency and LUT consumption.

Theorems & Definitions (8)

  • definition 1: takum encoding [2024-takum]
  • definition 2: linear takum encoding [2024-takum]
  • proposition 1: characteristic complement
  • proof
  • corollary 1: conditional characteristic complement
  • proof
  • proposition 2: characteristic precursor
  • proof