The Impact of Partial Computations on the Red-Blue Pebble Game
Pál András Papp, Aleksandros Sobczyk, A. N. Yzelman
TL;DR
The paper extends the red-blue pebble game (RBP) with partial computation steps, yielding the partial-computing red-blue pebble game (PRBP), to better capture I/O costs in associative computations. It establishes fundamental properties of PRBP and gives motivating examples and gadgets showing how partial computations can reduce I/O cost, while showing that the lower-bound tools from RBP must be adapted. It introduces edge- and dominator-based partition concepts to derive PRBP lower bounds and demonstrates that for canonical tasks like FFT, matrix multiplication, and Flash Attention, the PRBP bounds match the known RBP bounds, although in some specific DAGs PRBP affords substantial savings. The work also proves that it is NP-hard both to decide whether partial computations lower the cost on a given DAG and to approximate the PRBP optimum, and it discusses alternative model variants and directions for future research in I/O-efficient computation modeling and analysis.
Abstract
We study an extension of the well-known red-blue pebble game (RBP) with partial computation steps, inspired by the recent work of Sobczyk. While the original RBP assumes that all the inputs of an operation must be in fast memory at the same time, in many concrete computations the inputs can be aggregated one by one into the final output value. These partial computation steps can enable pebbling strategies with much smaller I/O cost, and in settings where such a step-by-step aggregation is possible, this extended red-blue pebble game offers a much more realistic cost model. We establish the fundamental properties of this partial-computing red-blue pebble game (PRBP) and compare it to the original RBP. We begin with some simple examples where allowing partial computations can decrease the optimal I/O cost. We also show that the cost can decrease by up to a linear factor this way, but that in general it is NP-hard to decide whether partial computations allow for a smaller cost in a specific DAG. We then discuss how $S$-partitions, a crucial tool for deriving I/O lower bounds in RBP, can be adapted to the PRBP model. These new tools are then used to establish lower bounds on the I/O cost of some prominent computational tasks. Finally, we adapt a hardness result from RBP, showing that the optimum cost is still NP-hard to approximate in PRBP to any reasonable factor.
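As a rough illustration of the idea of aggregating inputs one by one, the following Python sketch compares how many fast-memory slots a single summation node needs under the two styles of computation. It is a toy model under stated assumptions, not the paper's formal definition of RBP or PRBP: the function names and the slot-counting convention (one slot per resident value, including the output) are hypothetical and chosen only for this example.

```python
from typing import Iterable, List, Tuple


def sum_all_at_once(inputs: List[float]) -> Tuple[float, int]:
    """RBP-style toy model: every input is resident in fast memory
    at the same time, plus one slot for the output value."""
    fast_memory = list(inputs)           # all inputs loaded simultaneously
    result = sum(fast_memory)
    peak_slots = len(fast_memory) + 1    # inputs + output (assumed convention)
    return result, peak_slots


def sum_one_by_one(inputs: Iterable[float]) -> Tuple[float, int]:
    """Partial-computation toy model: one accumulator slot plus one slot
    for the input currently being aggregated."""
    accumulator = 0.0
    peak_slots = 1                       # the accumulator itself
    for value in inputs:                 # each input is loaded, used, discarded
        accumulator += value
        peak_slots = 2                   # accumulator + current input
    return accumulator, peak_slots


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    print(sum_all_at_once(data))   # (10.0, 5)
    print(sum_one_by_one(data))    # (10.0, 2)
```

Both functions return the same sum, but in this toy model the one-by-one version never holds more than two values at once, regardless of how many inputs the node has. This is only meant to convey why a smaller fast memory can suffice when partial aggregation is allowed; the actual I/O cost comparisons are those developed in the paper.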
