
MPIrigen: MPI Code Generation through Domain-Specific Language Models

Nadav Schneider, Niranjan Hasabnis, Vy A. Vo, Tal Kadosh, Neva Krien, Mihai Capotă, Guy Tamir, Ted Willke, Nesreen Ahmed, Yuval Pinter, Timothy Mattson, Gal Oren

TL;DR

The paper addresses the challenge of generating MPI-based parallel code, a problem that general-purpose LLMs do not solve well. It demonstrates that domain-specific fine-tuning on MPI C/C++ code yields superior MPI function insertion and argument generation, particularly when semantic information is controlled via TokomPiler preprocessing. By introducing HPCorpusMPI and MPIrigen (a fine-tuned MonoCoder), the authors show strong MPI code generation performance (location and function accuracy up to $0.80$ and argument accuracy up to $0.95$ under a variance of $2$), outperforming zero-shot GPT-3.5. This work highlights the value of domain adaptation for parallel computing code and points toward practical automatic parallelization tools. The HPCorpusMPI dataset and MPIrigen pipeline provide a foundation for further advances in HPC-focused code generation and verification.

Abstract

The imperative need to scale computation across numerous nodes highlights the significance of efficient parallel computing, particularly in the realm of Message Passing Interface (MPI) integration. The challenging parallel programming task of generating MPI-based parallel programs has remained unexplored. This study first investigates the performance of state-of-the-art language models in generating MPI-based parallel programs. Findings reveal that widely used models such as GPT-3.5 and PolyCoder (specialized multi-lingual code models) exhibit notable performance degradation when generating MPI-based programs compared to general-purpose programs. In contrast, domain-specific models such as MonoCoder, which are pretrained on the MPI-related programming languages C and C++, outperform larger models. Subsequently, we introduce a dedicated downstream task of MPI-based program generation by fine-tuning MonoCoder on HPCorpusMPI. We call the resulting model MPIrigen. We propose an innovative preprocessing step in which completion happens only after the whole code has been observed, thus enabling better completion with a wider context. Comparative analysis against GPT-3.5 zero-shot performance, using a novel HPC-oriented evaluation method, demonstrates that MPIrigen excels in generating accurate MPI functions, with up to 0.8 accuracy in location and function predictions and more than 0.9 accuracy in argument predictions. The success of this tailored solution underscores the importance of domain-specific fine-tuning in optimizing language models for parallel computing code generation, paving the way for a new generation of automatic parallelization tools. The sources of this work are available at our GitHub MPIrigen repository: https://github.com/Scientific-Computing-Lab-NRCN/MPI-rigen

Paper Structure

This paper contains 6 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: The MPI functions in the source code (1) are removed and concatenated, together with their corresponding line numbers, to the last line (3). This way, MPIrigen learns in a left-to-right fashion the relation between code and its appropriate MPI functions. The TokomPiler version in (2) embeds AST information while neglecting any semantic information. The TokomPiler version is used to demonstrate MonoCoder's independence from semantic information during generation, compared to other models.
  • Figure 2: Performance of various models on the code completion task over HPCorpusMPI. Models predict the code continuation starting from token 100, 300, or 600.
  • Figure 3: Performance breakdown of MPIrigen (MonoCoder-0.7B fine-tuned on 16K MPI codes from HPCorpusMPI) over programs containing n or fewer MPI function calls (X axis). The Y axis is the accuracy obtained for such programs. Note the shifted scales in sub-figures b and c.
  • Figure 4: Performance breakdown of GPT-3.5 using the prompt "Generate the optimal MPI functions for the provided code, and supply in the response the entire complete code with those MPI functions: [CODE]". The X axis represents programs containing n or fewer MPI function calls. The Y axis is the accuracy obtained for such programs. Note the shifted scales in sub-figures b and c.
  • Figure 5: Stacked bar chart of the ground-truth and MPIrigen prediction distributions of selected MPI functions under variance 2 (only correct location and function predictions are presented). The X axis represents programs containing n or fewer MPI function calls.
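
The preprocessing described in the Figure 1 caption (MPI calls removed from the code and appended, with their line numbers, to the last line) can be illustrated with a rough Python sketch. This is not the authors' implementation: the helper names `split_mpi_calls` and `build_training_example`, the regular expression used to detect MPI calls, the choice to number lines in the original file, and the `<line> <call>` serialization are all assumptions made for illustration. The sketch also handles only calls that fit on a single line.

```python
import re

# Assumption: a line is an "MPI call line" if it contains an identifier
# starting with MPI_ that is immediately followed by "(".
MPI_CALL = re.compile(r"\bMPI_\w+\s*\(")


def split_mpi_calls(source: str):
    """Separate MPI call lines from the rest of the code.

    Returns (code_without_mpi, calls), where calls is a list of
    (original_line_number, stripped_call_text) tuples.
    """
    kept, removed = [], []
    for number, line in enumerate(source.splitlines(), start=1):
        if MPI_CALL.search(line):
            removed.append((number, line.strip()))
        else:
            kept.append(line)
    return "\n".join(kept), removed


def build_training_example(source: str) -> str:
    """Append the removed calls, tagged with line numbers, as a final line.

    Hypothetical serialization: "<line> <call>" pairs joined by spaces.
    """
    code, calls = split_mpi_calls(source)
    suffix = " ".join(f"{number} {call}" for number, call in calls)
    return code + "\n" + suffix


SAMPLE = """int main(int argc, char **argv) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d", rank);
    MPI_Finalize();
    return 0;
}"""

if __name__ == "__main__":
    print(build_training_example(SAMPLE))
```

Because the model sees the full MPI-free program before the appended calls, a left-to-right decoder can condition each predicted call on the whole code rather than only on the preceding lines, which is the wider context the abstract refers to.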