Discovering Software Parallelization Points Using Deep Neural Networks

Izavan dos S. Correia, Henrique C. T. Santos, Tiago A. E. Ferreira

TL;DR

This work tackles automated detection of parallelizable loops in code using deep learning. A synthetic dataset of 4000 Python samples, generated via DEAP genetic algorithms, is used to train DNN and CNN classifiers on tokenized code sequences, with PCA employed for dimensionality reduction. Across 30 independent runs, CNN shows marginally better average performance and superior best-case accuracy (up to 97.67%), with both models displaying robust generalization under varying data retention levels, though worst-case runs reveal notable instability. The study demonstrates feasibility and potential practicality of DL-based parallelization assessment, offering a resource-efficient alternative to RL-based approaches and setting the stage for extensions to real-world code and transformer-based architectures.
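
As a rough illustration of what "tokenized code sequences" could look like, the sketch below turns a Python snippet into a fixed-length vector of integer ids using only the standard library `tokenize` module. The helper names (`tokenize_snippet`, `build_vocab`, `encode`), the choice to drop layout tokens, and the zero-padding scheme are assumptions made for this example, not details taken from the paper; the PCA step that would follow is not shown.

```python
import io
import tokenize

# Layout-only token types; dropping them is an assumption of this sketch.
SKIP = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
    tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT,
}


def tokenize_snippet(source):
    """Return the token strings of a Python snippet, without layout tokens."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in SKIP]


def build_vocab(token_lists):
    """Assign an integer id to each distinct token; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab, length):
    """Map tokens to ids, then truncate or zero-pad to a fixed length."""
    ids = [vocab[tok] for tok in tokens][:length]
    return ids + [0] * (length - len(ids))


snippet = "for i in range(n):\n    out[i] = a[i] * 2\n"
tokens = tokenize_snippet(snippet)
vocab = build_vocab([tokens])
vector = encode(tokens, vocab, length=24)
```

Fixed-length integer vectors of this kind are the sort of input that a dimensionality-reduction step and a DNN or CNN classifier could then consume.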

Abstract

This study proposes a deep learning-based approach for classifying loops in program code according to their potential for parallelization. Two genetic algorithm-based code generators were developed to produce two distinct types of code: (i) independent loops, which are parallelizable, and (ii) ambiguous loops, whose dependencies are unclear, so that it cannot be determined whether the loop is parallelizable. The generated code snippets were tokenized and preprocessed to ensure a robust dataset. Two deep learning models, a Deep Neural Network (DNN) and a Convolutional Neural Network (CNN), were implemented to perform the classification. A statistical analysis over 30 independent runs was used to assess the expected performance of both models. The CNN showed a slightly higher mean performance, while the two models had similar variability. Experiments with varying dataset sizes highlighted the importance of data diversity for model performance. These results demonstrate the feasibility of using deep learning to automate the identification of parallelizable structures in code, offering a promising tool for software optimization and performance improvement.
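
The distinction between the two loop classes can be made concrete with a small hand-written pair. This is only a sketch: both functions and their names are hypothetical and were not produced by the paper's generators, and the reading of "ambiguous" as "write location known only at run time" is an assumption of this example.

```python
def independent_loop(values):
    """Each iteration reads values[i] and writes only result[i]."""
    result = [0] * len(values)
    for i in range(len(values)):
        # No iteration touches data written by another iteration,
        # so the iterations could run in any order or in parallel.
        result[i] = values[i] * 2
    return result


def ambiguous_loop(values, targets):
    """The slot written in each iteration depends on run-time data."""
    result = [0] * len(values)
    for i in range(len(values)):
        # If two entries of targets are equal, two iterations update the
        # same slot; the code alone does not reveal whether that happens.
        result[targets[i]] += values[i]
    return result


doubled = independent_loop([1, 2, 3])
scattered = ambiguous_loop([1, 2, 3], [0, 0, 2])
```

In the second call, the first two iterations both accumulate into slot 0, which is the kind of overlap that cannot be ruled out by inspecting the loop body alone.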

Paper Structure

This paper contains 20 sections, 1 equation, 9 figures, 14 tables.

Figures (9)

  • Figure 1: Fitness evolution across 30 runs of the genetic algorithm. The plot shows the average, maximum, and minimum fitness per generation.
  • Figure 2: Evolutionary process for generating labeled code samples.
  • Figure 3: Boxplots showing the test accuracy (top) and test loss (bottom) of DNN and CNN models using 100% of retained PCA variance across 30 executions.
  • Figure 4: Boxplots showing the test accuracy (top) and test loss (bottom) of the DNN model under different PCA information retention levels (95%, 90%, 85%, and 80%) across 30 independent executions.
  • Figure 5: Boxplots showing the test accuracy (top) and test loss (bottom) of the CNN model under different PCA information retention levels (95%, 90%, 85%, and 80%) across 30 independent executions.
  • ...and 4 more figures