Data Race Detection by Digest-Driven Abstract Interpretation (Extended Version)
Michael Schwarz, Julian Erhard
TL;DR
Data race detection remains challenging for static analysis due to the need to reason about time separation of memory accesses across threads. The authors repurpose digests—summaries of computational histories—within a thread-modular abstract interpretation framework based on the local trace semantics to express exclusion criteria for potential conflicts as a predicate $||^{?}_g$. They formalize two data race definitions, derive a sound predicate for admissible, access-stable digests, and present a digest-driven race-detection algorithm implemented in Goblint. The evaluation on the SV-COMP data-race benchmark shows that combining lockset digests with thread-id and join digests yields substantial improvements: more than a fivefold increase in correctly solved race-free tasks over lockset alone, and strong overall performance with minimal false results. The work provides a modular, extensible foundation for principled static race detection using digests.
Abstract
Sound static analysis can prove the absence of data races by establishing that no two conflicting memory accesses can occur at the same time. We repurpose the concept of digests -- summaries of computational histories originally introduced to bring tunable concurrency-sensitivity to thread-modular value analysis by abstract interpretation, extending this idea to race detection: We use digests to capture the conditions under which conflicting accesses may not happen in parallel. To formalize this, we give a definition of data races in the thread-modular local trace semantics and show how exclusion criteria for potential conflicts can be expressed as digests. We report on our implementation of digest-driven data race detection in the static analyzer Goblint, and evaluate it on the SV-COMP benchmark suite. Combining the lockset digest with digests reasoning on thread ids and thread joins increases the number of correctly solved tasks by more than a factor of five compared to lockset reasoning alone.
