Evaluating Versal AI Engines for option price discovery in market risk analysis
Mark Klaisoongnoen, Nick Brown, Tim Dykes, Jessica R. Jones, Utz-Uwe Haus
TL;DR
This paper evaluates whether Versal ACAP's AI Engines can accelerate a standard market-risk benchmark (SIMR/STAC-A2) by porting the VariancePathQE kernel to AIEs and coupling it with Programmable Logic. Across single- and multi-AIE configurations, and in comparison with a PL-only implementation on Versal and on an Alveo U280, the study finds that AIE-based designs underperform the optimized PL solution due to architectural and tooling constraints, including dataflow inefficiencies, loopback overhead, and AXI-interface bottlenecks. The work provides detailed insights into dataflow design, AIE vectorization, and PL-AIE integration challenges, offering practical guidance for designing FPGA-based financial analytics accelerators. The findings underscore the current limitations of AIE-centric approaches for tightly coupled, cycle-accurate financial workloads and emphasize optimizing PL pathways for performance-critical market risk analysis. These insights are broadly applicable to high-performance numerical modelling on next-generation FPGA platforms.
Abstract
Whilst Field-Programmable Gate Arrays (FPGAs) have been popular for accelerating high-frequency financial workloads for many years, their application in quantitative finance, the use of mathematical models to analyse financial markets and securities, is less mature. Nevertheless, recent work has demonstrated the benefits that FPGAs can deliver to quantitative workloads, and in this paper we study whether the Versal ACAP and its AI Engines (AIEs) can also deliver improved performance. We focus specifically on STAC-A2, the industry-standard derivatives risk analysis benchmark from the Strategic Technology Analysis Center (STAC). Porting a purely FPGA-based accelerator for the STAC-A2 inspired market risk (SIMR) benchmark to the Versal ACAP device by combining Programmable Logic (PL) and AIEs, we explore the development approach and techniques before comparing performance across PL and AIEs. Ultimately, we found that our AIE approach is slower than an existing, highly optimised PL-only version, owing to limits on both the AIEs and the PL that we explore and describe.
