In industrial embedded software, are some compilation errors easier to localize and fix than others?
Han Fu, Sigrid Eldh, Kristian Wiklund, Andreas Ermedahl, Philipp Haller, Cyrille Artho
TL;DR
In industrial embedded development, engineers usually reach the real, domain-specific hardware only at the CI stage, so compilation failures are more frequent there than in other kinds of systems. The authors build Shadow Job, a parallel CI diagnostics solution, and use it to analyze over 40,000 builds from four projects, classify the compilation errors into 14 error types in five classes, and measure resolution time, fix size, and fix distance for each type. They find that dependency errors dominate (about 76%) and that the top five error types cover roughly 89% of compilation errors. Resolution time, fix size, and fix distance turn out to be independent of each other. These results suggest strong potential for automated fault localization and limited but actionable opportunities for automatic program repair focused on the most common industrial errors.
Abstract
Industrial embedded systems often require specialized hardware. However, software engineers have access to such domain-specific hardware only at the continuous integration (CI) stage and have to use simulated hardware otherwise. This results in a higher proportion of compilation errors at the CI stage than in other types of systems, warranting a deeper study. To this end, we create a CI diagnostics solution called "Shadow Job" that analyzes our industrial CI system. We collected over 40,000 builds from four projects in the product source code and categorized the compilation errors into 14 error types, showing that the five most common ones account for 89% of all compilation errors. Additionally, we analyze the resolution time, fix size, and fix distance for each error type to see whether some types of compilation errors are easier to localize or repair than others. Our results show that resolution time, size, and distance are independent of each other. Our research also provides insights into the human effort required to fix the most common industrial compilation errors, and we identify the most promising directions for future research on fault localization.
