MPI Errors Detection using GNN Embedding and Vector Embedding over LLVM IR

Jad El Karchi, Hanze Chen, Ali TehraniJamsaz, Ali Jannesari, Mihail Popov, Emmanuelle Saillard

TL;DR

This paper is the first to use embedding and deep learning graph neural networks (GNNs) to identify bugs in MPI programs. The authors designed and developed two models that determine, from a code's LLVM Intermediate Representation (IR), whether the code is correct or contains a known MPI error.

Abstract

Identifying errors in parallel MPI programs is a challenging task. Despite the growing number of verification tools, debugging parallel programs remains a significant challenge. This paper is the first to utilize embedding and deep learning graph neural networks (GNNs) to tackle the issue of identifying bugs in MPI programs. Specifically, we have designed and developed two models that can determine, from a code's LLVM Intermediate Representation (IR), whether the code is correct or contains a known MPI error. We tested our models using two dedicated MPI benchmark suites for verification: MBI and MPI-CorrBench. By training and validating our models on the same benchmark suite, we achieved a prediction accuracy of 92% in detecting error types. Additionally, we trained and evaluated our models on distinct benchmark suites (e.g., transitioning from MBI to MPI-CorrBench) and achieved a promising accuracy of over 80%. Finally, we investigated the interaction between different MPI errors and quantified our models' generalization capabilities over new, unseen errors. This involved removing error types during training and assessing whether our models could still predict them. The detection accuracy of removed errors varies significantly, between 20% and 80%, indicating connected error patterns.
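
The hold-out experiment described at the end of the abstract (removing an error type during training, then checking whether it is still detected) can be sketched as a simple data split. This is a minimal illustration, not the authors' implementation: the file names, the label strings, and the helper names `hold_out_error_type` and `detection_rate` are all hypothetical, and the model is stood in for by an arbitrary callable.

```python
# Hypothetical (sample, label) pairs; file names and label strings are
# illustrative assumptions, not the benchmark suites' actual categories.
samples = [
    ("a.ll", "correct"),
    ("b.ll", "deadlock"),
    ("c.ll", "type_mismatch"),
    ("d.ll", "correct"),
    ("e.ll", "deadlock"),
    ("f.ll", "invalid_argument"),
]


def hold_out_error_type(samples, held_out):
    """Split samples so that `held_out` never appears in the training set."""
    train = [(s, y) for s, y in samples if y != held_out]
    unseen = [(s, y) for s, y in samples if y == held_out]
    return train, unseen


def detection_rate(predict_is_error, unseen):
    """Fraction of held-out erroneous samples that the model flags as erroneous."""
    if not unseen:
        return 0.0
    flagged = sum(1 for s, _ in unseen if predict_is_error(s))
    return flagged / len(unseen)


# Train on everything except one error type, then test only on that type.
train, unseen = hold_out_error_type(samples, "deadlock")
```

Repeating the split once per error type gives one detection rate per removed type, which is the kind of per-type spread the abstract reports.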

Paper Structure (18 sections, 9 figures, 6 tables)

Figures (9)

  • Figure 1: Number of codes per error type in MPI-CorrBench (left) and MBI (right).
  • Figure 2: Code size in MPI-CorrBench (left) and MBI (right). Line counts are reported after the C preprocessor has expanded the include directives. MPI-CorrBench correct codes have a high line count compared to the incorrect codes. In contrast, MBI has no significant outlier in the line count.
  • Figure 3: Number of correct and incorrect codes in MBI and MPI-CorrBench.
  • Figure 4: Predicting errors in MPI applications with embedding based models.
  • Figure 5: Graph Neural Network based model to predict errors in MPI.
  • ...and 4 more figures