Convergence of Datalog over (Pre-) Semirings

Mahmoud Abo Khamis, Hung Q. Ngo, Reinhard Pichler, Dan Suciu, Yisu Remy Wang

TL;DR

This paper considers an ordered semiring, defines the semantics of a datalog program as a least fixpoint in this semiring, and describes a class of ordered semirings on which one can use the semi-naive evaluation algorithm on any datalog program.

Abstract

Recursive queries have been traditionally studied in the framework of datalog, a language that restricts recursion to monotone queries over sets, which is guaranteed to converge in polynomial time in the size of the input. But modern big data systems require recursive computations beyond the Boolean space. In this paper we study the convergence of datalog when it is interpreted over an arbitrary semiring. We consider an ordered semiring, define the semantics of a datalog program as a least fixpoint in this semiring, and study the number of steps required to reach that fixpoint, if ever. We identify algebraic properties of the semiring that correspond to certain convergence properties of datalog programs. Finally, we describe a class of ordered semirings on which one can use the semi-naïve evaluation algorithm on any datalog program.
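
As an illustration of the least-fixpoint semantics described above, here is a minimal sketch, not taken from the paper, that assumes the tropical semiring (where semiring addition is `min` and semiring multiplication is `+`) and a single recursive rule for single-source shortest paths. The function name `naive_sssp`, the `max_steps` cap, and the sample graph are hypothetical choices made for this example. It uses plain naive iteration rather than semi-naive evaluation, and it counts the steps taken to reach the fixpoint.

```python
import math

INF = math.inf  # additive identity of the tropical semiring (min, +)


def naive_sssp(edges, source, nodes, max_steps=1000):
    """Naively iterate L(x) = [x = source] (+) (+)_z L(z) (x) E(z, x),
    reading (+) as min and (x) as ordinary addition.

    Returns (distances, steps), or (None, max_steps) if no fixpoint
    is reached within max_steps iterations.
    """
    dist = {x: INF for x in nodes}  # start from the additive identity
    for step in range(1, max_steps + 1):
        new = {}
        for x in nodes:
            # Contribution of the indicator [x = source].
            best = 0.0 if x == source else INF
            # Semiring sum over all z of L(z) (x) E(z, x).
            for (z, y), weight in edges.items():
                if y == x:
                    best = min(best, dist[z] + weight)
            new[x] = best
        if new == dist:  # nothing changed, so a fixpoint is reached
            return dist, step
        dist = new
    return None, max_steps


# Hypothetical input graph for illustration.
edges = {("a", "b"): 1.0, ("b", "c"): 2.0, ("a", "c"): 5.0}
nodes = ["a", "b", "c"]
dist, steps = naive_sssp(edges, "a", nodes)
print(dist, steps)
```

On this acyclic graph the iteration stabilizes after a few steps. With a negative-weight cycle reachable from the source, the values would keep decreasing and the loop would stop only at the `max_steps` cap, which is the "if ever" case mentioned in the abstract.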

Paper Structure

This paper contains 39 sections, 27 theorems, 125 equations, 6 figures, 3 algorithms.

Key Result

Theorem 1.2

Given a POPS $\bm P$, the following hold.

Figures (6)

  • Figure 1: Computing the fixpoint of $(f,g)$.
  • Figure 2: A graph illustrating Example \ref{ex:reachability:sssp} (a) and Example \ref{ex:sum1:sum2} (b)
  • Figure 3: $X$-Parse trees of depth $\leq 2$ for the grammar in Example \ref{ex:fundamental}
  • Figure 4: Example graph with edges $E = \{(a,b), (a,c), (b,a), (c, d)$, $(c,e), (d,e), (e,f)\}$, used for the win-move game.
  • Figure 5: Orders $\leq_k$ and $\leq_t$ in the bilattice FOUR \cite{DBLP:journals/jlp/Fitting91}
  • ...and 1 more figure

Theorems & Definitions (69)

  • Example 1.1
  • Theorem 1.2
  • Definition 2.1: (Pre-)semiring
  • Example 2.2
  • Definition 2.3: POPS
  • Proposition 2.4
  • proof
  • Definition 2.5
  • Example 2.6
  • Definition 2.7
  • ...and 59 more