Advances in Semantic Patching for HPC-oriented Refactorings with Coccinelle
Michele Martone, Julia Lawall
TL;DR
The paper addresses the challenge of porting CPU-based HPC codes to GPUs by proposing a workflow that uses Coccinelle and semantic patches (SmPL) to express HPC-oriented refactorings as separate, testable rules. This rule-based approach enables transformations such as instrumentation insertion, architecture variants, data-layout rewrites, and API translations while preserving code readability and debuggability. By contrasting SmPL with alternative tooling (e.g., LLVM/ROSE) and ML-based code assistants, the work showcases practical, replayable refactorings that can be applied incrementally and reversibly. The contributions provide a principled path for large-scale, performance-portable HPC code evolution, with implications for maintainability and collaboration across domain scientists and performance engineers.
Abstract
Currently, the most energy-efficient hardware platforms for floating-point-intensive calculations (also known as High Performance Computing, or HPC) are graphics processing units (GPUs). However, porting existing scientific codes to GPUs can be far from trivial. This article summarizes our recent advances in enabling machine-assisted, HPC-oriented refactorings with reference to existing APIs and programming idioms available in C and C++. The tool we are extending and using for this purpose is Coccinelle. An important workflow we aim to support is that of writing and maintaining tersely written application code, while deferring circumstantial, ad hoc, performance-related changes to specific, separate rules called semantic patches. GPUs currently offer very limited debugging facilities. Our approach aims to preserve the intelligibility, longevity, and, relatedly, debuggability of existing code on CPUs, while at the same time enabling HPC-oriented code evolutions, such as introducing support for GPUs, in a scriptable and possibly parametric manner. This article sketches a number of self-contained use cases, including further HPC-oriented cases that are independent of GPUs.
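To make one of the refactoring kinds named in the summary concrete, the following minimal C sketch shows a data-layout rewrite by hand, assuming that an array-of-structures (AoS) to structure-of-arrays (SoA) change is a representative instance. The types `particle` and `particles` and the functions `sum_x_aos` and `sum_x_soa` are hypothetical names chosen for illustration and do not come from the paper. In the workflow described above, a change of this shape would be captured in a separate semantic patch rather than edited into the application code manually.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical particle data; all names here are illustrative only. */

/* Before: array of structures (AoS). */
struct particle { double x, y, z; };

double sum_x_aos(const struct particle *p, size_t n)
{
    assert(p != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += p[i].x;    /* strided access: y and z lie between successive x */
    return s;
}

/* After: structure of arrays (SoA). */
struct particles { double *x, *y, *z; };

double sum_x_soa(const struct particles *p, size_t n)
{
    assert(p != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += p->x[i];   /* unit-stride access over a contiguous array */
    return s;
}
```

The only difference between the two loop bodies is the access expression (`p[i].x` versus `p->x[i]`). A rewrite this regular is the sort of change that lends itself to being stated once as a rule and replayed across a code base, while the original AoS code stays available for debugging on CPUs.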
