A Data-driven Analysis of Code Optimizations

Yacine Hakimi, Riyadh Baghdadi

TL;DR

This paper addresses the challenge of designing autoschedulers that automatically select and sequence code optimizations. It takes a data-driven approach: the authors generate two large synthetic datasets and systematically analyze how loop interchange, skewing, tiling, parallelization, and unrolling interact, including the effect of their order, their repetition, and the loop depth at which parallelization is applied. From this analysis they derive actionable rules: prioritize the outer 30% of loops for parallelization, use skewing to enable parallelism (but not for locality), cap schedule length at 5, and follow a data-derived transformation order that remains reconfigurable in some cases. Integrating these rules into the LOOPer autoscheduler yields a geometric-mean speedup of about 1.15× and a search that is about 1.24× faster on PolyBench benchmarks. The work gives concrete guidance for shrinking the search space while preserving optimization potential, with evidence that the rules transfer to unseen functions.

Abstract

As the demand for computational power grows, optimizing code through compilers becomes increasingly crucial. In this context, we focus on fully automatic code optimization techniques that automate the process of selecting and applying code transformations for better performance without manual intervention. Understanding how these transformations behave and interact is key to designing more effective optimization strategies. Compiler developers must make numerous design choices when constructing these heuristics. For instance, they may decide whether to allow transformations to be explored in any arbitrary order or to enforce a fixed sequence. While the former may theoretically offer the best performance gains, it significantly increases the search space. This raises an important question: Can a predefined, fixed order of applying transformations speed up the search without severely compromising optimization potential? In this paper, we address this and other related questions that arise in the design of automatic code optimization algorithms. Using a data-driven approach, we generate a large dataset of random programs, apply random optimization sequences, and record their execution times. Through statistical analysis, we provide insights that guide the development of more efficient automatic code optimization algorithms.
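To make the search-space trade-off between arbitrary and fixed ordering concrete, here is a minimal counting sketch. It rests on simplifying assumptions that are not taken from the paper: each of `n` transformations is used at most once, and transformation parameters (tile sizes, unrolling factors, and so on) are ignored. The function names are hypothetical.

```python
from math import perm


def count_fixed_order(n: int) -> int:
    """Schedules when the order is fixed: each transformation is either in or out."""
    return 2 ** n


def count_any_order(n: int) -> int:
    """Schedules when any order is allowed: ordered selections of k out of n, for every k."""
    return sum(perm(n, k) for k in range(n + 1))


if __name__ == "__main__":
    # Assumption: 5 transformation types, each applied at most once, no parameters.
    n = 5
    print("fixed order:", count_fixed_order(n))
    print("any order:  ", count_any_order(n))
```

Under these assumptions, five transformations give 32 candidate schedules with a fixed order and 326 with an arbitrary order, and the gap widens quickly as `n` grows or as repetition and parameters are allowed.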

Paper Structure

This paper contains 38 sections and 7 figures.

Figures (7)

  • Figure 1: Variation of the mean speedup enabled by parallelization as a function of the relative loop level
  • Figure 2: Speedup variation based on unrolling factors - Dataset A
  • Figure 3: Speedup variation based on unrolling factors - Dataset B
  • Figure 4: Number of functions per schedule length
  • Figure 5: Mean and Maximal speedup by schedule length
  • ...and 2 more figures