ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution

Liu Yang; Zeyu Nie; Andrew Liu; Felix Zou; Deniz Altinbüken; Amir Yazdanbakhsh; Quanquan C. Liu

TL;DR

ParEVO is a framework for synthesizing high-performance parallel algorithms for irregular data. On the ParEval benchmark it achieves an average 106x speedup across the suite and a 13.6x speedup on complex irregular graph problems, outperforming state-of-the-art commercial models.

Abstract

The transition from sequential to parallel computing is essential for modern high-performance applications but is hindered by the steep learning curve of concurrent programming. This challenge is magnified for irregular data structures (such as sparse graphs, unbalanced trees, and non-uniform meshes) where static scheduling fails and data dependencies are unpredictable. Current Large Language Models (LLMs) often fail catastrophically on these tasks, generating code plagued by subtle race conditions, deadlocks, and sub-optimal scaling. We bridge this gap with ParEVO, a framework designed to synthesize high-performance parallel algorithms for irregular data. Our contributions include: (1) The Parlay-Instruct Corpus, a curated dataset of 13,820 tasks synthesized via a "Critic-Refine" pipeline that explicitly filters for empirically performant algorithms that effectively utilize Work-Span parallel primitives; (2) specialized DeepSeek, Qwen, and Gemini models fine-tuned to align probabilistic generation with the rigorous semantics of the ParlayLib library; and (3) an Evolutionary Coding Agent (ECA) that improves the "last mile" of correctness by iteratively repairing code using feedback from compilers, dynamic race detectors, and performance profilers. On the ParEval benchmark, ParEVO achieves an average 106x speedup (with a maximum of 1103x) across the suite, and a robust 13.6x speedup specifically on complex irregular graph problems, outperforming state-of-the-art commercial models. Furthermore, our evolutionary approach matches state-of-the-art expert human baselines, achieving up to a 4.1x speedup on specific highly-irregular kernels. Source code and datasets are available at https://github.com/WildAlg/ParEVO.
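For readers unfamiliar with the Work-Span terminology used in the abstract, the standard textbook bounds are sketched below. This is general background on the work-span model, not a formula taken from the paper; the symbols $W$ (total work, equal to the one-processor time $T_1$), $S$ (span, the critical-path length), and $P$ (processor count) are notation assumed here for illustration.

```latex
% Work-span bounds for a greedy scheduler on P processors (background, assumed notation)
\[
  \max\!\left(\frac{W}{P},\, S\right) \;\le\; T_P \;\le\; \frac{W}{P} + S,
  \qquad
  \text{speedup} \;=\; \frac{T_1}{T_P} \;\le\; \min\!\left(P,\ \frac{W}{S}\right).
\]
```

Under these bounds, an algorithm scales well only when its span $S$ is small relative to $W/P$, which is why primitives with low span matter for irregular workloads.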

Paper Structure

This paper contains 43 sections, 2 equations, 21 figures, and 6 tables.

Introduction
Related Work
Methodology: The ParEVO System
Stage 1: The Parlay-Instruct Fine-Tuning Dataset Corpus
Seed Generation and Mutation
The Critic Loop: Rejection Sampling
Data Verification Pipeline
Performance Optimization Dataset
Rust Parlay Primitives
Rust Evolutionary Dataset
Stage 2: Fine-Tuning DeepSeek, Gemini-2.5, Qwen3 for ParlayLib and Rust RPB
Training Configuration
Evaluation Environment
Stage 3: Evolutionary Coding Agent (ECA)
Evolutionary Search Strategy
...and 28 more sections

Figures (21)

Figure 1: A representative sample from the training corpus. Each sample includes a natural language instruction, the ground-truth parallel implementation, and an executable unit test used for verification.
Figure 2: Comparison of Event Generation Strategies. Left: Code A employs a Map-Scan-Write pattern to enable lock-free parallel writing. Right: Code B relies on sequential push_back, preventing parallelization and incurring reallocation costs.
Figure 3: Overview of the ParEVO Framework. The system integrates human expert context (problem formulation, parallel tooling) with an evolutionary LLM agent. The cycle iteratively refines candidate parallel algorithms through a rigorous evaluation framework (correctness verification, dynamic race detection, and performance profiling), using metrics to guide the selection of the next population via MAP-Elites.
Figure 4: ParEval Metrics Comparison between Gemini-2.5-Pro and Gemini-2.5-Parlay. (a-c) highlight that fine-tuning significantly improves the model's ability to construct valid ParlayLib code, with substantial gains in build and pass rates as well as improved running time over the base model.
Figure 5: Semantic Alignment Example. The base model (top) fails to compile due to incorrect API usage and strict type definitions in the lambda. The fine-tuned model (bottom) correctly identifies sort_inplace and uses auto to handle the complex number types safely.
...and 16 more figures
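The contrast described in the Figure 2 caption can be illustrated with a minimal sketch. It is not the paper's Code A or Code B, and it uses only the C++17 standard library rather than ParlayLib. The `Event` type and both function names are hypothetical, and the loops are written sequentially; the point is that in the Map-Scan-Write version each item writes to a disjoint, precomputed range, so the write loop has no cross-iteration dependency and could be handed to a parallel loop without locks.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical event record for illustration: (source item, sequence number).
struct Event {
    std::size_t source;
    std::size_t seq;
    bool operator==(const Event& o) const {
        return source == o.source && seq == o.seq;
    }
};

// Baseline in the style of "Code B": sequential push_back. Each append depends
// on the current size of `out`, so iterations cannot run independently, and
// growing the vector may trigger reallocation.
std::vector<Event> generate_push_back(const std::vector<std::size_t>& counts) {
    std::vector<Event> out;
    for (std::size_t i = 0; i < counts.size(); ++i)
        for (std::size_t k = 0; k < counts[i]; ++k)
            out.push_back({i, k});
    return out;
}

// Map-Scan-Write in the style of "Code A".
std::vector<Event> generate_map_scan_write(const std::vector<std::size_t>& counts) {
    const std::size_t n = counts.size();

    // Map: determine how many events each item emits. In a real kernel this
    // would be computed per item; here the input already supplies the counts.

    // Scan: an exclusive prefix sum turns counts into starting offsets.
    std::vector<std::size_t> offsets(n);
    std::exclusive_scan(counts.begin(), counts.end(), offsets.begin(),
                        std::size_t{0});
    const std::size_t total = (n == 0) ? 0 : offsets[n - 1] + counts[n - 1];

    // Allocate the output once, so no reallocation happens during writing.
    std::vector<Event> out(total);

    // Write: item i owns the range [offsets[i], offsets[i] + counts[i]).
    // Ranges are disjoint, so iterations over i are independent.
    for (std::size_t i = 0; i < n; ++i) {
        assert(offsets[i] + counts[i] <= out.size());
        for (std::size_t k = 0; k < counts[i]; ++k)
            out[offsets[i] + k] = {i, k};
    }
    return out;
}
```

Both functions produce the same sequence of events for the same input; only the second exposes the independence that a parallel runtime needs.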
