From PyTorch to Calyx: An Open-Source Compiler Toolchain for ML Accelerators
Jiahan Xie, Evan Williams, Adrian Sampson
TL;DR
The paper tackles translating PyTorch models to FPGA-ready hardware through an open-source stack that leverages Allo, CIRCT, and Calyx to generate synthesizable SystemVerilog. It introduces memory banking and control-structure optimizations to enable safe data parallelism and efficient hardware execution. Through FFNN, CNN, and MHA benchmarks, the authors show that while Calyx lags commercial tools in baseline scheduling, aggressive memory partitioning yields substantial performance gains and demonstrates the viability of an open-source accelerator compilation route. The work highlights the potential of Calyx as a flexible research platform for hardware accelerator design with full open-source tooling.
Abstract
We present an end-to-end open-source compiler toolchain that generates synthesizable SystemVerilog from ML models written in PyTorch. Our toolchain leverages the accelerator design language Allo, the hardware intermediate representation (IR) Calyx, and the CIRCT project under LLVM. We also implement a set of compiler passes for memory partitioning, enabling effective parallelism in memory-intensive ML workloads. Experimental results demonstrate that our compiler can effectively generate optimized FPGA-implementable hardware designs that perform reasonably well against closed-source industry-grade tools such as Vitis HLS.
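To make the memory-partitioning idea concrete, the following is a minimal Python sketch of cyclic banking, one common partitioning scheme. It is an illustration under stated assumptions, not the paper's compiler pass: the function names (`cyclic_bank`, `partition`, `conflict_free`) are hypothetical, and the model assumes each bank serves one access per cycle.

```python
def cyclic_bank(index: int, num_banks: int) -> tuple[int, int]:
    """Map a flat index to (bank, offset) under cyclic partitioning.

    Hypothetical helper for illustration; not an API from Allo, CIRCT, or Calyx.
    """
    return index % num_banks, index // num_banks


def partition(data: list[int], num_banks: int) -> list[list[int]]:
    """Split one logical memory into `num_banks` physical banks."""
    banks: list[list[int]] = [[] for _ in range(num_banks)]
    for i, value in enumerate(data):
        bank, _ = cyclic_bank(i, num_banks)
        # Elements arrive in index order, so position in the bank equals the offset.
        banks[bank].append(value)
    return banks


def conflict_free(start: int, unroll: int, num_banks: int) -> bool:
    """Check whether `unroll` consecutive accesses all land in distinct banks.

    Assumes each bank can serve a single access per cycle.
    """
    touched = [cyclic_bank(start + k, num_banks)[0] for k in range(unroll)]
    return len(set(touched)) == unroll


banks = partition(list(range(8)), num_banks=4)
print(banks)
print(conflict_free(0, unroll=4, num_banks=4))
print(conflict_free(0, unroll=4, num_banks=2))
```

In this model, four parallel accesses to consecutive elements are safe with four banks but collide with two, which is the intuition behind partitioning memories before parallelizing a loop.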
