Super-Linear Speedup by Generalizing Runtime Repeated Recursion Unfolding in Prolog
Thom Fruehwirth
TL;DR
The paper addresses accelerating recursive programs by generalizing runtime repeated recursion unfolding to Prolog, enabling super-linear speedups for multiple and multi-rule recursions. It introduces a lean implementation comprising an unfolder, a generalized meta-interpreter, and a round-robin rule processor, all integrated as a ten-clause Prolog framework. Through benchmarks on summation, Fibonacci, and GCD, the approach demonstrates substantial, often super-linear, speedups and provides complexity analyses that support the observed behavior. This work advances online program optimization for recursive predicates and suggests potential extensions to mutual recursion and other programming languages.
Abstract
Runtime repeated recursion unfolding was recently introduced as a just-in-time program transformation strategy that can achieve super-linear speedup. So far, the method has been restricted to single linear direct recursive rules in the programming language Constraint Handling Rules (CHR). In this companion paper, we generalize the technique to multiple recursion and to multiple recursive rules and provide an implementation of the generalized method in the logic programming language Prolog. The basic idea of the approach is as follows: When a recursive call is encountered at runtime, the recursive rule is unfolded with itself, and this process is repeated with each resulting unfolded rule as long as it is applicable to the current call. In this way, more and more recursive steps are combined into one recursive step. Then an interpreter applies these rules to the call, starting from the most unfolded rule. For recursions that have sufficiently simplifiable unfoldings, a super-linear speedup can be achieved, i.e., the time complexity is reduced. We implement an unfolder, a generalized meta-interpreter, and a novel round-robin rule processor for our generalization of runtime repeated recursion unfolding with just ten clauses in Prolog. We illustrate the feasibility of our technique with worst-case time complexity estimates and benchmarks for some basic classical algorithms that achieve a super-linear speedup.
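The basic idea can be illustrated with a minimal sketch for the summation example. This is not the paper's Prolog implementation: it is written in Python for brevity, and the rule representation is an assumption made for illustration only. A rule is modeled as a hypothetical triple (k, a, b) meaning sum(n) = a*n + b + sum(n-k), applicable when n >= k. The names unfold, unfolded_rules, and run_sum are likewise hypothetical.

```python
# Illustrative sketch only (assumed representation, not the paper's code).
# A rule (k, a, b) stands for: sum(n) = a*n + b + sum(n - k), if n >= k.
# The original recursive rule sum(n) = n + sum(n - 1) is (1, 1, 0).

def unfold(rule):
    """Unfold a rule with itself and simplify the result.

    a*n + b + a*(n - k) + b + sum(n - 2k)
      = 2a*n + (2b - a*k) + sum(n - 2k)
    """
    k, a, b = rule
    return (2 * k, 2 * a, 2 * b - a * k)

def unfolded_rules(n):
    """Repeat unfolding as long as the unfolded rule applies to the call."""
    rules = [(1, 1, 0)]
    while 2 * rules[-1][0] <= n:
        rules.append(unfold(rules[-1]))
    return rules

def run_sum(n):
    """Apply the rules to the call, starting from the most unfolded one."""
    total = 0
    for k, a, b in reversed(unfolded_rules(n)):
        if n >= k:              # rule is applicable to the current call
            total += a * n + b  # one step that covers k original steps
            n -= k
    return total                # n is 0 here; base case sum(0) = 0

print(run_sum(10))
```

In this sketch each unfolded rule doubles the number of original recursive steps it covers, so a call with argument n is handled by a number of rule applications that is logarithmic in n instead of n recursive steps. The simplification inside unfold is what keeps each unfolded rule cheap to apply; without it, the unfolded rules would only regroup the same work.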
