Improving Memory Dependence Prediction with Static Analysis
Luke Panayi, Rohan Gandhi, Jim Whittaker, Vassilios Chouliaras, Martin Berger, Paul Kelly
TL;DR
This work addresses memory dependence prediction (MDP) in Out-of-Order CPUs by proposing a static-analysis-based approach that pre-labels loads as Predict No Dependency (PND), allowing them to bypass MDP lookups. The analysis is implemented as an LLVM IR pass, and PND labels are communicated to an AArch64 CPU via minimally invasive opcode changes, so the hardware receives them at no additional cost while correctness is still enforced at commit. Simulation in gem5 over a Spec2017 subset shows an average reduction in MDP lookups of $13\%$ and CPI improvements of up to $0.7\%$, with some benchmarks showing larger improvements, especially when MDP tables are smaller. The results indicate that static analysis can provide meaningful, near-zero-cost performance benefits, and they motivate further exploration of IR-based labeling and more advanced MDP algorithms for future CPU designs.
Abstract
This paper explores the potential of communicating information gained by static analysis from compilers to Out-of-Order (OoO) machines, focusing on the memory dependence predictor (MDP). The MDP enables loads to issue before all in-flight store addresses are known, while keeping memory order violations to a minimum. We use LLVM to find loads with no dependencies and label them via their opcode. These labelled loads skip MDP lookups, which improves prediction accuracy by reducing false dependencies. We communicate this information in a minimally intrusive way, i.e., without introducing additional hardware costs or instruction bandwidth, so these improvements come without any additional overhead in the CPU. We find that in select cases in Spec2017, a significant number of load instructions can skip interacting with the MDP, leading to a performance gain. These results point to greater possibilities for static analysis as a source of near-zero-cost performance gains in future CPU designs.
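To make the mechanism concrete, the following is a minimal toy sketch, not the predictor evaluated in the paper. It assumes a hypothetical direct-mapped, one-bit MDP table indexed by load PC (the names `ToyMDP`, `train_violation`, `predict_dependent` and the `pnd` flag are illustrative only). It shows how an unlabelled load that aliases with a trained entry inherits a false dependency, while a PND-labelled load skips the lookup entirely.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMDP:
    """Hypothetical one-bit, direct-mapped memory dependence predictor."""
    size: int = 4
    table: list = field(default_factory=list)
    lookups: int = 0  # number of table lookups actually performed

    def __post_init__(self):
        # One "predict dependent" bit per entry, initially cleared.
        self.table = [False] * self.size

    def train_violation(self, pc: int) -> None:
        # After a memory order violation, mark the load's entry as dependent.
        self.table[pc % self.size] = True

    def predict_dependent(self, pc: int, pnd: bool = False) -> bool:
        # A PND-labelled load bypasses the table: no lookup, no aliasing.
        if pnd:
            return False
        self.lookups += 1
        return self.table[pc % self.size]


def demo():
    mdp = ToyMDP(size=4)
    mdp.train_violation(0x10)  # load at 0x10 once caused a violation
    # Load at 0x20 maps to the same entry as 0x10 (both are 0 modulo 4).
    unlabelled = mdp.predict_dependent(0x20)          # aliased: false dependency
    labelled = mdp.predict_dependent(0x20, pnd=True)  # skipped lookup
    return unlabelled, labelled, mdp.lookups


print(demo())
```

In this sketch the smaller the table, the more often distinct loads share an entry, which is one intuition for why skipping lookups might matter more when MDP tables are small.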
