Polynomial Time Convergence of the Iterative Evaluation of $\textsf{Datalog}^\circ$ Programs

Sungjin Im, Benjamin Moseley, Hung Q. Ngo, Kirk Pruhs

TL;DR

The paper asks whether the naïve iterative evaluation of $\textsf{Datalog}^\circ$ programs over $p$-stable semirings converges within a polynomial number of iterations. It develops a technical framework built on a strengthened Parikh's theorem, small-support linear representations, and CFL-based analysis to bound the iteration depth. The main result is a bound of $O(\sigma p n^2(n^2 \lg \lambda + \lg \sigma))$ iterations, which extends polynomial-rate convergence from linear programs to general nonlinear $\textsf{Datalog}^\circ$ programs over stable semirings. This advances the understanding of fixpoint computation with recursive aggregation and informs practical evaluation strategies in data analytics where semiring semantics are used.

Abstract

$\textsf{Datalog}^\circ$ is an extension of Datalog that allows for aggregation and recursion over an arbitrary commutative semiring. Like Datalog, $\textsf{Datalog}^\circ$ programs can be evaluated via the natural iterative algorithm until a fixed point is reached. However, unlike Datalog, the natural iterative evaluation of some $\textsf{Datalog}^\circ$ programs over some semirings may not converge. It is known that the commutative semirings for which the iterative evaluation of $\textsf{Datalog}^\circ$ programs is guaranteed to converge are exactly those semirings that are stable [7]. Previously, the best known upper bound on the number of iterations until convergence over $p$-stable semirings was $\sum_{i=1}^n (p+2)^i = \Theta(p^n)$ steps, where $n$ is (essentially) the output size. We establish that, in fact, the natural iterative evaluation of a $\textsf{Datalog}^\circ$ program over a $p$-stable semiring converges within a polynomial number of iterations. In particular, our upper bound is $O(\sigma p n^2(n^2 \lg \lambda + \lg \sigma))$, where $\sigma$ is the number of elements in the semiring present in either the input databases or the $\textsf{Datalog}^\circ$ program, and $\lambda$ is the maximum number of terms in any product in the $\textsf{Datalog}^\circ$ program.
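To make the "natural iterative algorithm" concrete, here is a minimal sketch, not taken from the paper. It evaluates one linear rule, $L(y) = [y = s] \oplus \bigoplus_x L(x) \otimes E(x, y)$, over the min-plus (tropical) semiring on nonnegative weights, which is 0-stable because $\min(0, u) = 0$ for every $u \ge 0$. The function name `naive_eval`, the toy graph, and the rule itself are illustrative assumptions.

```python
import math

def naive_eval(nodes, edges, source, plus, times, zero, one, max_iters=10_000):
    """Naively iterate L(y) = [y == source] (+) SUM_x L(x) (*) E(x, y).

    Starts from the all-`zero` instance and re-applies the rule until two
    consecutive instances coincide (a fixed point). Returns the fixed point
    and the number of iterations that changed the instance.
    """
    current = {v: zero for v in nodes}
    for step in range(1, max_iters + 1):
        nxt = {}
        for y in nodes:
            acc = one if y == source else zero
            for (x, target), weight in edges.items():
                if target == y:
                    acc = plus(acc, times(current[x], weight))
            nxt[y] = acc
        if nxt == current:
            return current, step - 1
        current = nxt
    # Over a non-stable semiring the loop may never reach a fixed point.
    raise RuntimeError("no fixed point within max_iters")

# Min-plus semiring: (+) is min, (*) is ordinary addition,
# the additive identity is infinity and the multiplicative identity is 0.
nodes = ["a", "b", "c"]
edges = {("a", "b"): 1, ("b", "c"): 2, ("a", "c"): 5}  # hypothetical toy input
dist, steps = naive_eval(nodes, edges, "a", min, lambda u, v: u + v, math.inf, 0)
print(dist, steps)
```

On this toy input the fixed point is the shortest-path distance from `a`. The `max_iters` guard is only a safety net for semirings where convergence is not guaranteed.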

Paper Structure

This paper contains 14 sections, 8 theorems, 53 equations, and 2 figures.

Key Result

Theorem 1.1

Let $\bm S$ be a $p$-stable commutative semiring. Let $P$ be a $\textsf{Datalog}^\circ$ program where the maximum number of multiplicands in any product is at most $\lambda$. Let $D$ be the input database instance. Let $\sigma$ be the number of semiring elements referenced in $P$ or $D$. Let $n$ be (essentially) the output size. Then the natural iterative evaluation of $P$ on $D$ converges within $O(\sigma p n^2(n^2 \lg \lambda + \lg \sigma))$ steps.
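For a rough sense of the gap between the two bounds quoted in the abstract, the sketch below evaluates the earlier bound $\sum_{i=1}^n (p+2)^i$ next to the expression inside the new $O(\cdot)$. It assumes that $\lg$ means $\log_2$ and it ignores the hidden constant, so the numbers only indicate growth rates, not actual iteration counts. The function names and the sample parameters are hypothetical.

```python
import math

def old_bound(p: int, n: int) -> int:
    """Previously best known bound: sum_{i=1}^{n} (p + 2)^i."""
    return sum((p + 2) ** i for i in range(1, n + 1))

def new_bound_expr(sigma: int, p: int, n: int, lam: int) -> float:
    """Expression inside the O(.): sigma * p * n^2 * (n^2 lg(lam) + lg(sigma)).

    Assumes lg is log base 2 and drops the constant factor hidden by O(.).
    """
    return sigma * p * n**2 * (n**2 * math.log2(lam) + math.log2(sigma))

# Hypothetical parameter choices, for illustration only.
for n in (5, 10, 20, 40):
    print(n, old_bound(p=1, n=n), round(new_bound_expr(sigma=10, p=1, n=n, lam=3)))
```

The first column of values grows exponentially in $n$, while the second grows like $n^4$ for fixed $\sigma$, $p$, and $\lambda$.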

Figures (2)

  • Figure 1: Illustration of a wedge $W$
  • Figure 2: The right tree $T$ is recovered from the left tree $T'$ by augmenting the wedge $W$.

Theorems & Definitions (18)

  • Theorem 1.1
  • Example 1
  • Example 2
  • Example 3
  • Example 4
  • Example 5
  • Theorem 3.1
  • Lemma 3.2
  • Definition 4.1
  • Lemma 4.2
  • ...and 8 more