Table of Contents
Fetching ...

From Batch to Stream: Automatic Generation of Online Algorithms

Ziteng Wang, Shankara Pailoor, Aaryan Prakash, Yuepeng Wang, Isil Dillig

TL;DR

The paper tackles the problem of automatically generating online streaming algorithms from offline batch implementations. It introduces relational function signatures (RFS) to relate online state to offline computations and proves soundness and completeness under mild conditions, then presents a concrete synthesis algorithm that decomposes the problem into independent subproblems. Implemented as Opera, the approach combines symbolic reasoning with search, QE-based implicate generation, and template-based interpolation to synthesize online schemes that are inductive relative to the inferred RFS. Empirically, Opera solves 50 of 51 benchmarks across statistics and online auctions, outperforming adapted SyGuS baselines and demonstrating the practical viability of automated online algorithm synthesis. The work advances automatic incremental computation by providing a scalable, domain-agnostic workflow that yields online variants with comparable results to their offline originals, enabling broader deployment in streaming analytics contexts.

Abstract

Online streaming algorithms, tailored for continuous data processing, offer substantial benefits but are often more intricate to design than their offline counterparts. This paper introduces a novel approach for automatically synthesizing online streaming algorithms from their offline versions. In particular, we propose a novel methodology, based on the notion of relational function signature (RFS), for deriving an online algorithm given its offline version. Then, we propose a concrete synthesis algorithm that is an instantiation of the proposed methodology. Our algorithm uses the RFS to decompose the synthesis problem into a set of independent subtasks and uses a combination of symbolic reasoning and search to solve each subproblem. We implement the proposed technique in a new tool called Opera and evaluate it on over 50 tasks spanning two domains: statistical computations and online auctions. Our results show that Opera can automatically derive the online version of the original algorithm for 98% of the tasks. Our experiments also demonstrate that Opera significantly outperforms alternative approaches, including adaptations of SyGuS solvers to this problem as well as two of Opera's own ablations.

From Batch to Stream: Automatic Generation of Online Algorithms

TL;DR

The paper tackles the problem of automatically generating online streaming algorithms from offline batch implementations. It introduces relational function signatures (RFS) to relate online state to offline computations and proves soundness and completeness under mild conditions, then presents a concrete synthesis algorithm that decomposes the problem into independent subproblems. Implemented as Opera, the approach combines symbolic reasoning with search, QE-based implicate generation, and template-based interpolation to synthesize online schemes that are inductive relative to the inferred RFS. Empirically, Opera solves 50 of 51 benchmarks across statistics and online auctions, outperforming adapted SyGuS baselines and demonstrating the practical viability of automated online algorithm synthesis. The work advances automatic incremental computation by providing a scalable, domain-agnostic workflow that yields online variants with comparable results to their offline originals, enabling broader deployment in streaming analytics contexts.

Abstract

Online streaming algorithms, tailored for continuous data processing, offer substantial benefits but are often more intricate to design than their offline counterparts. This paper introduces a novel approach for automatically synthesizing online streaming algorithms from their offline versions. In particular, we propose a novel methodology, based on the notion of relational function signature (RFS), for deriving an online algorithm given its offline version. Then, we propose a concrete synthesis algorithm that is an instantiation of the proposed methodology. Our algorithm uses the RFS to decompose the synthesis problem into a set of independent subtasks and uses a combination of symbolic reasoning and search to solve each subproblem. We implement the proposed technique in a new tool called Opera and evaluate it on over 50 tasks spanning two domains: statistical computations and online auctions. Our results show that Opera can automatically derive the online version of the original algorithm for 98% of the tasks. Our experiments also demonstrate that Opera significantly outperforms alternative approaches, including adaptations of SyGuS solvers to this problem as well as two of Opera's own ablations.
Paper Structure (22 sections, 7 theorems, 35 equations, 19 figures, 2 tables, 6 algorithms)

This paper contains 22 sections, 7 theorems, 35 equations, 19 figures, 2 tables, 6 algorithms.

Key Result

theorem 1

Let $\mathcal{P} = \lambda xs. E$ be an offline program and $\mathcal{S} = (\mathcal{I}, \mathcal{P}')$ an online scheme. Let $\Phi(xs, y)$ be an RFS between $\mathcal{P}$ and $\mathcal{P}'$ such that $\Phi[y_1] = E$. Then, if $\mathcal{S} \models \Phi$, we have $\mathcal{P} \simeq \mathcal{S}$.

Figures (19)

  • Figure 1: Schematic illustration of our synthesis methodology
  • Figure 2: Offline and online algorithms for computing variance, implemented in Python. In the online version, x corresponds to the new stream element, and the first four are auxiliary parameters. We assume that all division operators are safe, meaning that they produce a default value of $0$ when the denominator is $0$.
  • Figure 3: Intermediate presentation of variance computation.
  • Figure 4: Relational Function Signature for Welford's algorithm
  • Figure 5: Sketch generated by Opera for variance.
  • ...and 14 more figures

Theorems & Definitions (12)

  • definition 1: Online-Offline Equivalence
  • definition 2: Relational Function Signature
  • definition 3: Inductiveness relative to RFS
  • definition 4: Model of RFS
  • theorem 1
  • theorem 2
  • definition 5
  • theorem 3
  • theorem 4
  • Lemma 1
  • ...and 2 more