Runtime Repeated Recursion Unfolding in CHR: A Just-In-Time Online Program Optimization Strategy That Can Achieve Super-Linear Speedup
Thom Fruehwirth
TL;DR
This work speeds up linear direct recursive computations with a just-in-time strategy called runtime repeated recursion unfolding, which unfolds recursive rules online to produce specialized variants that each cover multiple recursive steps. A lean CHR+Prolog implementation combines an unfolder that generates the unfolded rules with a meta-interpreter that applies them in an optimal order; the base case of the recursion is ignored by the technique. The paper proves correctness, derives time-complexity recurrences, and establishes a sufficient condition as well as a sufficient and necessary condition for super-linear speedup, supported by benchmarks on summation, list reversal, and sorting. The benchmarks show runtime improvements that quickly reach several orders of magnitude over the original recursion, supporting the feasibility and practicality of online program optimization via recursion unfolding. The paper also discusses limitations (the need for problem-specific simplifications) and future work on extending the approach to multiple recursions and other languages.
Abstract
We introduce a just-in-time runtime program transformation strategy based on repeated recursion unfolding. Our online program optimization generates several versions of a recursion differentiated by the minimal number of recursive steps covered. The base case of the recursion is ignored in our technique. Our method is introduced here on the basis of single linear direct recursive rules. When a recursive call is encountered at runtime, first an unfolder creates specializations of the associated recursive rule on-the-fly and then an interpreter applies these rules to the call. Our approach reduces the number of recursive rule applications to its logarithm at the expense of introducing a logarithmic number of generic unfolded rules. We prove correctness of our online optimization technique and determine its time complexity. For recursions which have enough simplifiable unfoldings, a super-linear speedup is possible, i.e., a speedup by more than a constant factor. The necessary simplification is problem-specific and has to be provided at compile-time. In our speedup analysis, we prove a sufficient condition as well as a sufficient and necessary condition for super-linear speedup relating the complexity of the recursive steps of the original rule and the unfolded rules. We have implemented an unfolder and meta-interpreter for runtime repeated recursion unfolding with just five rules in Constraint Handling Rules (CHR) embedded in Prolog. We illustrate the feasibility of our approach with simplifications, time complexity results and benchmarks for some basic tractable algorithms. The simplifications require some insight and were derived manually. The runtime improvement quickly reaches several orders of magnitude, consistent with the super-linear speedup predicted by our theorems.
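
To make the idea concrete, the following is a minimal Python sketch of the summation example, not the CHR implementation described in the abstract. It assumes the recursion sum(n) = n + sum(n-1) with sum(0) = 0; the function names (`unfold_steps`, `apply_unfolded_rule`, `sum_unfolded`, `sum_naive`) are hypothetical and chosen only for illustration. The hand-derived simplification is that a rule covering k recursive steps adds k*n - k*(k-1)/2 in one application, so doubling k at each unfolding yields a logarithmic number of rules, each applied at most once.

```python
def unfold_steps(n):
    """Unfolder (sketch): step sizes 1, 2, 4, ... not exceeding n.

    Each step size k stands for one unfolded rule covering k recursive steps.
    """
    steps = []
    k = 1
    while k <= n:
        steps.append(k)
        k *= 2
    return steps


def apply_unfolded_rule(n, k):
    """Simplified body of the rule covering k steps of sum(n) = n + sum(n-1).

    n + (n-1) + ... + (n-k+1) collapses to k*n - k*(k-1)/2.
    Returns the remaining argument and the increment to the result.
    """
    return n - k, k * n - k * (k - 1) // 2


def sum_unfolded(n):
    """Interpreter (sketch): try the most unfolded rule first, each at most once."""
    total = 0
    applications = 0
    for k in reversed(unfold_steps(n)):
        if n >= k:
            n, increment = apply_unfolded_rule(n, k)
            total += increment
            applications += 1
    # n is now 0, the assumed base case, which contributes nothing.
    return total, applications


def sum_naive(n):
    """Original recursion written as a loop: one rule application per step."""
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total


if __name__ == "__main__":
    result, applications = sum_unfolded(1000)
    print(result, applications, sum_naive(1000))
```

In this sketch the number of rule applications equals the number of 1-bits in the binary representation of n, which is at most logarithmic in n, whereas the original recursion needs n applications.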
