From Batch to Stream: Automatic Generation of Online Algorithms
Ziteng Wang, Shankara Pailoor, Aaryan Prakash, Yuepeng Wang, Isil Dillig
TL;DR
The paper tackles the problem of automatically generating online streaming algorithms from offline batch implementations. It introduces relational function signatures (RFS) to relate online state to offline computations and proves soundness and completeness under mild conditions, then presents a concrete synthesis algorithm that decomposes the problem into independent subproblems. Implemented as Opera, the approach combines symbolic reasoning with search, QE-based implicate generation, and template-based interpolation to synthesize online schemes that are inductive relative to the inferred RFS. Empirically, Opera solves 50 of 51 benchmarks across statistics and online auctions, outperforming adapted SyGuS baselines and demonstrating the practical viability of automated online algorithm synthesis. The work advances automatic incremental computation by providing a scalable, domain-agnostic workflow that yields online variants with comparable results to their offline originals, enabling broader deployment in streaming analytics contexts.
Abstract
Online streaming algorithms, tailored for continuous data processing, offer substantial benefits but are often more intricate to design than their offline counterparts. This paper introduces a novel approach for automatically synthesizing online streaming algorithms from their offline versions. In particular, we propose a novel methodology, based on the notion of relational function signature (RFS), for deriving an online algorithm given its offline version. Then, we propose a concrete synthesis algorithm that is an instantiation of the proposed methodology. Our algorithm uses the RFS to decompose the synthesis problem into a set of independent subtasks and uses a combination of symbolic reasoning and search to solve each subproblem. We implement the proposed technique in a new tool called Opera and evaluate it on over 50 tasks spanning two domains: statistical computations and online auctions. Our results show that Opera can automatically derive the online version of the original algorithm for 98% of the tasks. Our experiments also demonstrate that Opera significantly outperforms alternative approaches, including adaptations of SyGuS solvers to this problem as well as two of Opera's own ablations.
