Mapping code on Coarse Grained Reconfigurable Arrays using a SAT solver

Cristian Tirelli, Laura Pozzi

TL;DR

The paper tackles the problem of mapping the data flow graphs (DFGs) of compute-intensive loops onto coarse-grained reconfigurable arrays (CGRAs) while minimizing the iteration interval (II). It introduces the Kernel Mobility Schedule (KMS) and a SAT-based CNF formulation (SAT-MapIt) that together explore the mapping space under modulo scheduling to reach a low II. Experiments on MiBench and Rodinia benchmarks show that the approach often yields a lower II than state-of-the-art methods and, on average, reduces compilation time. The authors also provide an open-source tool-chain. The work offers a practical method for high-quality CGRA mappings and suggests potential for broader architectural applicability.

Abstract

Emerging low-power architectures like Coarse-Grain Reconfigurable Arrays (CGRAs) are becoming more common. Often included as co-processors, they are used to accelerate compute-intensive workloads like loops. The speedup obtained is defined by the hardware design of the accelerator and by the quality of the compilation. State-of-the-art (SoA) compilation techniques leverage modulo scheduling to minimize the Iteration Interval (II), exploit the architectural parallelism and, consequently, reduce the execution time of the accelerated workload. In our work, we focus on improving the compilation process by finding the lowest II for any given topology, through a satisfiability (SAT) formulation of the mapping problem. We introduce a novel schedule, called Kernel Mobility Schedule, to encode all the possible mappings for a given Data Flow Graph (DFG) and for a given II. The schedule is used together with the CGRA architectural information to generate all the constraints necessary to find a valid mapping. Experimental results demonstrate that our method not only reduces compilation time on average but also achieves higher quality mappings compared to existing SoA techniques.

Paper Structure

This paper contains 11 sections, 1 equation, 4 figures.

Figures (4)

  • Figure 1: a) Abstract $3\times 3$ CGRA architecture with 2D-mesh topology. b) Loop in DFG form. Red edges are loop-carried dependencies; black edges are data dependencies. c) One valid mapping of the DFG on the left in a $2\times 2$ CGRA.
  • Figure 2: The SAT-MapIt tool-chain iteratively increases the II if no mapping is returned by the SAT solver or if the register allocation phase fails.
  • Figure 3: a) Modulo scheduling of the DFG in Figure 1.b, highlighting the division between prologue, kernel, and epilogue. b) Kernel Mobility Schedule generation. c) One valid mapping of the kernel on a $2\times 2$ CGRA.
  • Figure 4: Experimental results of the chosen benchmarks for 4 different CGRA sizes. We compare the best II obtained by RAMP and PathSeeker with the II found by our tool-chain SAT-MapIt. A red cross means that the process did not terminate before a timeout of 4000 seconds. A black cross means the process was terminated when it reached the max II allowed (50) without finding a feasible solution. Red dashes indicate the minimum II (mII). For the $2\times2$ CGRA, hotspot had a mII of 17, which is not displayed.
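
The iterative search described in the captions of Figures 2 and 4 can be sketched as a simple loop. This is a minimal illustration, not the SAT-MapIt implementation: the function `search_lowest_ii` and the two callbacks `solve_mapping` and `allocate_registers` are hypothetical names standing in for the SAT-solving and register-allocation phases, and starting the search at the minimum II is an assumption. The cap of 50 comes from the Figure 4 caption; the 4000-second timeout is not modeled.

```python
from typing import Callable, Optional, Tuple, TypeVar

M = TypeVar("M")  # opaque mapping type produced by the (hypothetical) solver phase


def search_lowest_ii(
    min_ii: int,
    solve_mapping: Callable[[int], Optional[M]],
    allocate_registers: Callable[[M], bool],
    max_ii: int = 50,
) -> Optional[Tuple[int, M]]:
    """Return (II, mapping) for the first II that passes both phases, else None.

    solve_mapping(ii) is assumed to return a mapping, or None if the
    formulation for that II is unsatisfiable. allocate_registers(mapping)
    is assumed to return True when register allocation succeeds.
    """
    for ii in range(min_ii, max_ii + 1):
        mapping = solve_mapping(ii)
        if mapping is None:
            continue  # no mapping at this II: try the next one
        if allocate_registers(mapping):
            return ii, mapping
        # register allocation failed: also move on to the next II
    return None  # reached the maximum allowed II without a feasible solution


# Stub phases for illustration only: a mapping exists from II = 3 onward,
# but register allocation fails at exactly II = 3.
def stub_solver(ii: int) -> Optional[dict]:
    return {"ii": ii} if ii >= 3 else None


def stub_regalloc(mapping: dict) -> bool:
    return mapping["ii"] != 3


result = search_lowest_ii(2, stub_solver, stub_regalloc)
print(result)  # (4, {'ii': 4})
```

Because the II values are tried in increasing order, the first one that passes both phases is the lowest II the two phases accept within the cap.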