Float Self-Tagging

Olivier Melançon; Manuel Serrano; Marc Feeley

TL;DR

Self-tagging applies an invertible bitwise transformation to IEEE 754 doubles so that, for a subset of them, the resulting bit pattern already carries the correct type tag. Those floats can be stored directly as tagged values, with no heap allocation. By choosing the rotation and the exponent-bit patterns that serve as tags, the method achieves substantial memory savings, with performance changes relative to NaN-boxing and NuN-boxing that are modest and depend on the workload. The approach is evaluated in two Scheme compilers (Bigloo and Gambit) across four architectures, showing near-elimination of heap-allocated floats in common cases and competitive overall performance, especially in memory-heavy workloads. Its portability to 32-bit systems, simple integration into existing runtimes, and measured trade-offs across workloads make self-tagging a promising alternative for implementing floating-point numbers efficiently in dynamic languages.

Abstract

Dynamic and polymorphic languages attach information, such as types, to run-time objects, and therefore adapt the memory layout of values to include space for this information. This makes it difficult to efficiently implement IEEE 754 floating-point numbers, as this format does not leave an easily accessible space to store type information. The three main floating-point number encodings in use today, tagged pointers, NaN-boxing, and NuN-boxing, have drawbacks. Tagged pointers entail a heap allocation of all float objects, and NaN/NuN-boxing puts additional run-time costs on type checks and the handling of other objects. This paper introduces self-tagging, a new approach to object tagging that uses an invertible bitwise transformation to map floating-point numbers to tagged values that contain the correct type information at the correct position in their bit pattern, superimposing both their value and type information in a single machine word. Such a transformation can only map a subset of all floats to correctly typed tagged values, hence self-tagging takes advantage of the non-uniform distribution of floating-point numbers used in practice to avoid heap allocation of the most frequently encountered floats. Variants of self-tagging were implemented in two distinct Scheme compilers and evaluated on four microarchitectures to assess their performance and compare them to tagged pointers, NaN-boxing, and NuN-boxing. Experiments demonstrate that, in practice, the approach eliminates heap allocation of nearly all floating-point numbers and provides good execution speed on float-intensive benchmarks in Scheme with a negligible performance impact on other benchmarks, making it an attractive alternative to tagged pointers, alongside NaN-boxing and NuN-boxing.
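
The core idea, an invertible bit transformation that leaves a valid type tag in the tag position for some floats, can be illustrated with a minimal Python sketch. It is not the encoding used in the paper's implementations. The 3-bit low tag, the 4-bit rotation, the two tag values reserved for floats, and all function names are assumptions chosen for illustration only.

```python
import struct

MASK64 = (1 << 64) - 1
TAG_BITS = 3                   # assumed width of the low-bit type tag
ROT = 1 + TAG_BITS             # rotate the sign bit and top 3 exponent bits to the bottom
FLOAT_TAGS = {0b011, 0b100}    # hypothetical tag values reserved for floats


def float_to_bits(x: float) -> int:
    """Reinterpret a double as its 64-bit IEEE 754 bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_float(bits: int) -> float:
    """Reinterpret a 64-bit pattern as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def rotl(word: int, n: int) -> int:
    """Rotate a 64-bit word left by n bits."""
    return ((word << n) | (word >> (64 - n))) & MASK64


def rotr(word: int, n: int) -> int:
    """Rotate a 64-bit word right by n bits."""
    return ((word >> n) | (word << (64 - n))) & MASK64


def try_self_tag(x: float):
    """Return the tagged word if x tags itself, else None.

    After the rotation, the low TAG_BITS bits are the top three exponent
    bits of x. If they equal a tag reserved for floats, the rotated word
    is already a correctly tagged value. Otherwise a runtime would fall
    back to a heap-allocated float (not modeled here).
    """
    word = rotl(float_to_bits(x), ROT)
    tag = word & ((1 << TAG_BITS) - 1)
    return word if tag in FLOAT_TAGS else None


def untag(word: int) -> float:
    """Invert the transformation to recover the original double."""
    return bits_to_float(rotr(word, ROT))


if __name__ == "__main__":
    for x in (1.0, -1.5, 3.14, 0.0, 1e300):
        w = try_self_tag(x)
        print(x, "self-tagged" if w is not None else "needs fallback")
```

Under these assumptions, doubles with magnitudes from about 2^-255 up to (but not including) 2^257 tag themselves, which covers most values seen in everyday numeric code. Values outside that range, including 0.0 in this simplified version, would take the fallback path. A real runtime would choose tags and special cases based on measured float distributions.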
