Generating Bindings in MPICH
Hui Zhou, Ken Raffenetti, Wesley Bland, Yanfei Guo
TL;DR
The paper addresses the large maintenance burden caused by duplicated MPICH binding code and MPI interface definitions by introducing a Python-driven toolbox that consumes the MPI Forum's semantic API descriptions. It demonstrates how semantic, language-neutral API descriptions enable automated generation of C and Fortran 2008 bindings, replacing approximately 70,000 lines of manually maintained code with about 5,000 lines of Python and easing the addition of MPI-4 features such as large count functions and QMPI prototyping. The approach also supports MPIX extensions with minimal configuration, and the richer semantic data makes it feasible to generate the Fortran 2008 (F08) bindings as well. While successful, the work notes that it relies on a derived API input rather than on upstream sources, and it calls for richer upstream semantics and templating to improve maintainability and future evolution.
Abstract
The MPI Forum has recently adopted a Python scripting engine for generating the API text in the standard document. As a by-product, it made available reliable and rich descriptions of all MPI functions that are well suited to scripting tools. Using this extracted API information, we developed a Python code generation toolbox to generate the language binding layers in MPICH. The toolbox replaces nearly 70,000 lines of manually maintained C and Fortran 2008 binding code with around 5,000 lines of Python scripts plus some simple configuration. In addition to completely eliminating code duplication in the binding layer and avoiding bugs from manual code copying, the code generation also minimizes the effort for API extension and code instrumentation. This is demonstrated in our implementation of MPI-4 large count functions and the prototyping of a next-generation MPI profiling interface, QMPI.
