DR.FIX: Automatically Fixing Data Races at Industry Scale

Farnaz Behrang, Zhizhou Zhang, Georgian-Vlad Saioc, Peng Liu, Milind Chabbi

TL;DR

This work tackles automatic data-race fixing at industrial scale by combining program analysis with large language models in Dr.Fix, a Go-focused system deployed at Uber. The approach uses retrieval-augmented generation, guided by code skeleton abstractions and a vector database of past fixes, to generate idiomatic patches that are then validated through testing and developer code review. Dr.Fix produced patches for 55% of the races in its corpus (224 of 404), and 86% of those patches (193) were accepted by developers; average ticket closure time fell from 11 days to about 3 days. The work discusses the potential and limitations of GenAI-guided concurrency repair, its generality to other languages, and directions for improving skeleton abstractions, multi-fix generation, and broader tool integration.

Abstract

Data races are a prevalent class of concurrency bugs in shared-memory parallel programs, posing significant challenges to software reliability and reproducibility. While there is an extensive body of research on detecting data races and a wealth of practical detection tools across various programming languages, considerably less effort has been directed toward automatically fixing data races at an industrial scale. In large codebases, data races are continuously introduced and exhibit myriad patterns, making automated fixing particularly challenging. In this paper, we tackle the problem of automatically fixing data races at an industrial scale. We present Dr.Fix, a tool that combines large language models (LLMs) with program analysis to generate fixes for data races in real-world settings, effectively addressing a broad spectrum of racy patterns in complex code contexts. Implemented for Go, a programming language widely used in modern microservice architectures where concurrency is pervasive and data races are common, Dr.Fix integrates seamlessly into existing development workflows. We detail the design of Dr.Fix and examine how individual design choices influence the quality of the fixes produced. Over the past 18 months, Dr.Fix has been integrated into developer workflows at Uber, demonstrating its practical utility. During this period, Dr.Fix produced patches for 224 (55%) of a corpus of 404 data races spanning various categories; 193 of these patches (86%) were accepted by more than a hundred developers via code reviews and integrated into the codebase.
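For readers unfamiliar with the bug class, the following is a minimal Go sketch of a shared counter and one common way to make it race-free. It is a generic illustration, not an example taken from the paper or from Dr.Fix output; the names `Counter` and `countConcurrently` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical shared structure. The mutex guards n so that
// concurrent Inc calls do not race on it.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc increments the counter under the lock. Without the lock, n++ is an
// unsynchronized read-modify-write that the Go race detector would flag
// when called from several goroutines.
func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value reads the counter under the same lock.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countConcurrently starts `workers` goroutines that each increment a shared
// counter once, waits for all of them, and returns the final value.
func countConcurrently(workers int) int {
	var wg sync.WaitGroup
	c := &Counter{}
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	return c.Value()
}

func main() {
	fmt.Println(countConcurrently(100)) // prints 100
}
```

A mutex is only one possible repair; depending on the surrounding code, an atomic operation, a channel, or a per-goroutine copy of the data may be more idiomatic.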

Paper Structure

This paper contains 59 sections, 2 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: Dr.Fix schematic diagram.
  • Figure 2: Contextual information of a race report with two racing goroutines, Child1 and Child2. The call paths of Child1 and Child2 at the time of the race are $P\rightarrow Q \rightarrow R$ and $I\rightarrow J \rightarrow K$, respectively. Child1 was created by Parent1 at the call path $A\rightarrow B \rightarrow C$; Child2 was created by Parent2 at call path $A\rightarrow B \rightarrow D$. Each node in the call graph carries its function name and source location. Racy accesses have read/write annotations.
  • Figure 3: Comparison of how using examples and selecting them via concurrency skeleton impacts the number of successful fixes.
  • Figure 4: Comparison of how fixing at different scopes and using the diagnostics from previous failures improves the success rate.