HMTRace: Hardware-Assisted Memory-Tagging based Dynamic Data Race Detection

Jaidev Shastri, Xiaoguang Wang, Basavesh Ammanaghatta Shivakumar, Freek Verbeek, Binoy Ravindran

TL;DR

This work addresses the persistent challenge of data races in multi-threaded programs by introducing HMTRace, a hardware-assisted dynamic race detector that uses the Armv8.5-A Memory Tagging Extension (MTE) to enforce memory access constraints with minimal overhead. HMTRace combines lockset analysis with structured MTE-based tagging (tag-based race inference, TBRI) to detect races in OpenMP- and Pthread-based C applications, providing on-the-fly reporting and system-wide tag reuse. Empirical results show HMTRace achieving an F1-score of roughly 0.86–0.88 with zero false positives. Its mean execution time overhead (~4.01%) is substantially lower than that of state-of-the-art dynamic detectors such as ThreadSanitizer and Archer, and its peak RSS growth (~54.31%) is moderate. The approach shows practical potential for production environments, enabling high-precision race detection with a reduced instrumentation burden. It also identifies new races in large-scale applications such as Redis and jemalloc, underscoring its relevance for real-world software reliability.

Abstract

Data races, a category of insidious software concurrency bugs, are often challenging and resource-intensive to detect and debug. Existing dynamic race detection tools incur significant execution time and memory overhead while exhibiting high false-positive rates. This paper proposes HMTRace, a novel dynamic data race detection framework based on the Armv8.5-A Memory Tagging Extension (MTE), emphasizing low compute and memory requirements while maintaining high accuracy and precision. HMTRace supports race detection in userspace OpenMP- and Pthread-based multi-threaded C applications. HMTRace achieves a combined F1-score of 0.86 while incurring a mean execution time overhead of 4.01% and a peak memory (RSS) overhead of 54.31%. HMTRace also reports no false positives: every race it reports is a true race.
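
The TL;DR names lockset analysis as one of the two ingredients of HMTRace. As a rough illustration of that idea only, the sketch below follows the classic Eraser-style lockset rule: for each shared variable, keep the set of locks held on every access so far, and flag the variable once two or more threads have touched it and that set is empty. This is not HMTRace's implementation. The class name `LocksetDetector`, its methods, and the trace are hypothetical, and reads and writes are deliberately not distinguished.

```python
from dataclasses import dataclass, field


@dataclass
class LocksetDetector:
    """Eraser-style lockset tracking over a trace of lock and access events.

    Hypothetical illustration: every access is treated like a write.
    """
    held: dict = field(default_factory=dict)        # thread id -> locks currently held
    candidates: dict = field(default_factory=dict)  # variable -> locks held on every access so far
    accessors: dict = field(default_factory=dict)   # variable -> threads that accessed it
    races: list = field(default_factory=list)       # variables flagged as potentially racy

    def acquire(self, tid, lock):
        self.held.setdefault(tid, set()).add(lock)

    def release(self, tid, lock):
        self.held.setdefault(tid, set()).discard(lock)

    def access(self, tid, var):
        locks = self.held.get(tid, set())
        if var not in self.candidates:
            # First access: the candidate set starts as the locks held right now.
            self.candidates[var] = set(locks)
        else:
            # Later accesses: keep only locks that were held every time.
            self.candidates[var] &= locks
        self.accessors.setdefault(var, set()).add(tid)
        shared = len(self.accessors[var]) > 1
        if shared and not self.candidates[var] and var not in self.races:
            self.races.append(var)


detector = LocksetDetector()

# "counter" is always accessed under lock "m": no report expected.
for tid in ("t1", "t2"):
    detector.acquire(tid, "m")
    detector.access(tid, "counter")
    detector.release(tid, "m")

# "flag" is accessed by both threads with no lock held: reported.
detector.access("t1", "flag")
detector.access("t2", "flag")

print(detector.races)
```

A pure lockset rule like this one can flag accesses that are in fact ordered by other synchronization, which is one reason detectors typically combine it with further information.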

Paper Structure (52 sections, 9 equations, 6 figures, 2 tables, 2 algorithms)

Figures (6)

  • Figure 1: Visual representation of the shadow cell used in ThreadSanitizer and Archer for shared memory access tracking. Every potentially shared 8-byte allocation access is mapped to one of the four shadow values.
  • Figure 2: Overview of hardware based memory tagging extension (MTE).
  • Figure 3: Overview of HMTRace framework. Blocks highlighted in gray signify unmodified modules.
  • Figure 4: Comparison of HB + LS with TBRI. A check signifies MTE tag match (no race), and an exclamation signifies MTE tag mismatch (race). $RD/WR$: Read/write events to the same pointee granule. $LA(x)/LR(x)$: Lock acquisition/release for lock $x$. $\tau_{a}, \tau_{b}$: Thread IDs representing two threads.
  • Figure 5: Precision and accuracy of the dynamic race detection tools. The F1-score, representing the weighted average of precision and accuracy, is listed within parentheses.
  • ...and 1 more figure
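
The Figure 4 caption equates an MTE tag match with "no race" and a tag mismatch with "race". The toy model below sketches only the underlying tag check, assuming the architectural basics of MTE: 16-byte granules, 4-bit tags, and the pointer's tag carried in address bits 59:56. How HMTRace chooses and reassigns tags is not described in this excerpt, so the retagging step at the end is a hypothetical scenario, and `TaggedMemory` and `with_tag` are made-up names.

```python
GRANULE = 16    # MTE tags memory in 16-byte granules
TAG_SHIFT = 56  # the pointer's tag occupies address bits 59:56
TAG_MASK = 0xF  # tags are 4 bits wide
ADDR_MASK = (1 << TAG_SHIFT) - 1


def with_tag(addr, tag):
    """Return a pointer value carrying `tag` in bits 59:56 (hypothetical helper)."""
    return (addr & ADDR_MASK) | ((tag & TAG_MASK) << TAG_SHIFT)


class TaggedMemory:
    """Toy software model of per-granule tags; real MTE does this in hardware."""

    def __init__(self):
        self.granule_tags = {}  # granule index -> tag; untagged granules read as 0

    def set_tag(self, addr, size, tag):
        """Tag every granule that overlaps [addr, addr + size)."""
        first = addr // GRANULE
        last = (addr + size - 1) // GRANULE
        for granule in range(first, last + 1):
            self.granule_tags[granule] = tag & TAG_MASK

    def check(self, ptr):
        """True on tag match, False on mismatch (where hardware would fault)."""
        ptr_tag = (ptr >> TAG_SHIFT) & TAG_MASK
        granule = (ptr & ADDR_MASK) // GRANULE
        return self.granule_tags.get(granule, 0) == ptr_tag


mem = TaggedMemory()
mem.set_tag(0x1000, 32, 0x3)   # a 32-byte object spanning two granules
p = with_tag(0x1000, 0x3)

print(mem.check(p))            # pointer tag equals granule tag: match

# Hypothetical scenario: the granules are retagged, so the stale
# pointer `p` no longer matches and the access would be flagged.
mem.set_tag(0x1000, 32, 0x5)
print(mem.check(p))
```

In this model a mismatch is just a `False` return value; on MTE hardware it surfaces as a tag check fault that a tool can intercept and report.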