Dr.Fix: Automatically Fixing Data Races at Industry Scale
Farnaz Behrang, Zhizhou Zhang, Georgian-Vlad Saioc, Peng Liu, Milind Chabbi
TL;DR
This work tackles automatic data-race fixing at industrial scale by combining program analysis with large language models in Dr.Fix, a Go-focused system deployed at Uber. The approach uses retrieval-augmented generation, guided by code skeleton abstractions and a vector database of past fixes, to generate idiomatic patches, which are then validated through testing and developer code review. Dr.Fix produced patches for 224 of 404 detected races (55%); developers accepted 193 of those patches (86%), and average ticket closure time fell from 11 days to about 3 days. The work highlights both the potential and the limitations of GenAI-guided concurrency repair, discusses generality to other languages, and outlines directions for improving skeleton abstractions, multi-fix generation, and broader tool integration.
Abstract
Data races are a prevalent class of concurrency bugs in shared-memory parallel programs, posing significant challenges to software reliability and reproducibility. While there is an extensive body of research on detecting data races and a wealth of practical detection tools across various programming languages, considerably less effort has been directed toward automatically fixing data races at an industrial scale. In large codebases, data races are continuously introduced and exhibit myriad patterns, making automated fixing particularly challenging. In this paper, we tackle the problem of automatically fixing data races at an industrial scale. We present Dr.Fix, a tool that combines large language models (LLMs) with program analysis to generate fixes for data races in real-world settings, effectively addressing a broad spectrum of racy patterns in complex code contexts. Dr.Fix is implemented for Go, a programming language widely used in modern microservice architectures where concurrency is pervasive and data races are common, and it integrates seamlessly into existing development workflows. We detail the design of Dr.Fix and examine how individual design choices influence the quality of the fixes produced. Over the past 18 months, Dr.Fix has been integrated into developer workflows at Uber, demonstrating its practical utility. During this period, Dr.Fix produced patches for 224 (55%) of a corpus of 404 data races spanning various categories; 193 of these patches (86%) were accepted by more than a hundred developers via code reviews and integrated into the codebase.
