Table of Contents
Fetching ...

Verifying a Realistic Mutable Hash Table

Samuel Chassot, Viktor Kunčak

TL;DR

The paper tackles the challenge of formally verifying a realistic mutable hash table in Scala, focusing on the mutable LongMap with open addressing and quadratic probing. It ports the Scala implementation to Stainless, uses tail-recursive structures and a decorator pattern to enable modular proofs, and verifies the mutable map against an immutable ListLongMap specification, aided by a ghost method that relates the two representations. A bug in the original repack/newMask computation is discovered for large capacities and fixed, with the verified version achieving roughly a 1.5x slowdown in lookups compared to the unverified implementation. The work demonstrates the feasibility of verifying mutable, in-place data structures in Stainless and provides architectural patterns (swap-based unique references and decorator proofs) to support verification of complex imperative Scala code, informing future verification of similar data structures.

Abstract

In this work, we verify the mutable LongMap from the Scala standard library, a hash table using open addressing within a single array, using the Stainless program verifier. As a reference implementation, we write an immutable map based on a list of tuples. We then show that LongMap's operations correspond to operations of this association list. To express the resizing of the hash table array, we introduce a new reference swapping construct in Stainless. This allows us to apply the decorator pattern without introducing aliasing. Our verification effort led us to find and fix a bug in the original implementation that manifests for large hash tables. Our performance analysis shows the verified version to be within a 1.5 factor of the original data structure.

Verifying a Realistic Mutable Hash Table

TL;DR

The paper tackles the challenge of formally verifying a realistic mutable hash table in Scala, focusing on the mutable LongMap with open addressing and quadratic probing. It ports the Scala implementation to Stainless, uses tail-recursive structures and a decorator pattern to enable modular proofs, and verifies the mutable map against an immutable ListLongMap specification, aided by a ghost method that relates the two representations. A bug in the original repack/newMask computation is discovered for large capacities and fixed, with the verified version achieving roughly a 1.5x slowdown in lookups compared to the unverified implementation. The work demonstrates the feasibility of verifying mutable, in-place data structures in Stainless and provides architectural patterns (swap-based unique references and decorator proofs) to support verification of complex imperative Scala code, informing future verification of similar data structures.

Abstract

In this work, we verify the mutable LongMap from the Scala standard library, a hash table using open addressing within a single array, using the Stainless program verifier. As a reference implementation, we write an immutable map based on a list of tuples. We then show that LongMap's operations correspond to operations of this association list. To express the resizing of the hash table array, we introduce a new reference swapping construct in Stainless. This allows us to apply the decorator pattern without introducing aliasing. Our verification effort led us to find and fix a bug in the original implementation that manifests for large hash tables. Our performance analysis shows the verified version to be within a 1.5 factor of the original data structure.

Paper Structure

This paper contains 19 sections, 7 figures, 1 table.

Figures (7)

  • Figure 1: Mutable LongMap interface and specification (note that we omit preconditions in this figure : they are only valid invariant checks, if any).
  • Figure 2: Lookup of N keys in a map prepopulated with $2^{22}$ pairs (left) (time normalized per operation) and insertion of $2^{22}$ pairs starting with an initial capacity of 16 followed by lookup of N keys (right). Horizontal lines show the average. The black vertical lines show $2^{22}$. The error bars show the 95% CI. The time on y-axis is normalized with respect to the original map.
  • Figure 3: Lookup of N keys in a prepopulated map with $2^{15}$ pairs (left) (time normalized per operation) and $2^{22}$ pairs (right) (time normalized per operation). Horizontal lines show the average. The black vertical line shows $2^{15}$ (left) and $2^{22}$ (right). The error bars show the 95% CI. The time on y-axis is normalized with respect to the original map.
  • Figure 4: Insertion of $2^{15}$ pairs (left) and $2^{22}$ pairs (right) starting with an initial capacity of 16 followed by lookup of N keys. The black vertical line shows $2^{15}$ (left) and $2^{22}$ (right). The error bars show the 95% CI. The time on y-axis is normalized with respect to the original map.
  • Figure 5: Insertion of $2^{15}$ pairs (left) and $2^{22}$ pairs (right) starting with an initial capacity of 16 followed by lookup of N keys. During the insertion process, all pairs are inserted, then one half is removed, then all pairs are inserted again. The black vertical line shows $2^{15}$ (left) and $2^{22}$ (right). The error bars show the 95% CI. The time on y-axis is normalized with respect to the original map.
  • ...and 2 more figures