From PyTorch to Calyx: An Open-Source Compiler Toolchain for ML Accelerators

Jiahan Xie, Evan Williams, Adrian Sampson

TL;DR

The paper tackles translating PyTorch models to FPGA-ready hardware through an open-source stack that leverages Allo, CIRCT, and Calyx to generate synthesizable SystemVerilog. It introduces memory banking and control-structure optimizations to enable safe data parallelism and efficient hardware execution. Through FFNN, CNN, and MHA benchmarks, the authors show that while Calyx lags commercial tools in baseline scheduling, aggressive memory partitioning yields substantial performance gains and demonstrates the viability of an open-source accelerator compilation route. The work highlights the potential of Calyx as a flexible research platform for hardware accelerator design with full open-source tooling.

Abstract

We present an end-to-end open-source compiler toolchain that targets synthesizable SystemVerilog from ML models written in PyTorch. Our toolchain leverages the accelerator design language Allo, the hardware intermediate representation (IR) Calyx, and the CIRCT project under LLVM. We also implement a set of compiler passes for memory partitioning, enabling effective parallelism in memory-intensive ML workloads. Experimental results demonstrate that our compiler can effectively generate optimized FPGA-implementable hardware designs that perform reasonably well against closed-source industry-grade tools such as Vitis HLS.

Paper Structure

This paper contains 14 sections, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Compilation pipeline from PyTorch through Allo to Calyx.
  • Figure 2: Wall-clock latency comparison across models.
  • Figure 3: Latency vs. partition factor for FFNN.