Complex event recognition meets hierarchical conjunctive queries
Dante Pinto, Cristian Riveros
TL;DR
This work addresses the challenge of robust, efficient streaming evaluation for complex event queries by bridging Hierarchical Conjunctive Queries ($\text{HCQ}$) and Complex Event Recognition ($\text{CER}$). It introduces Parallelized Complex Event Automata ($\text{PCEA}$), extending Chain Complex Event Automata with parallelization to capture sequencing, disjunction, iteration, correlation, and, crucially, conjunction, and proves that every $\text{HCQ}$ under bag semantics has an equivalent $\text{PCEA}$, while $\text{PCEA}$ precisely captures $\text{HCQ}$ among acyclic CQs. The paper then develops a streaming evaluation algorithm for unambiguous $\text{PCEA}$ with equality predicates under sliding windows, achieving logarithmic update time and output-linear enumeration delay by leveraging a specialized data structure $\textsf{DS}_w$. The results position $\text{PCEA}$ as a sweet spot that inherits HCQ’s favorable algorithmic properties while enabling order-aware CER pattern processing, with implications for real-time query processing over unbounded data streams. The work also delineates the expressiveness gap: $\text{PCEA}$ can define certain queries beyond CQ, yet cannot express non-hierarchical acyclic CQ, establishing a tight relationship with HCQ under the presented semantics.
Abstract
Hierarchical conjunctive queries (HCQ) are a subclass of conjunctive queries (CQ) with robust algorithmic properties. Among others, Berkholz, Keppeler, and Schweikardt have shown that HCQ is the subclass of CQ (without projection) that admits dynamic query evaluation with constant update time and constant-delay enumeration. In a different but related setting stands Complex Event Recognition (CER), a prominent technology for evaluating sequence patterns over streams. Since one can interpret a data stream as an unbounded sequence of inserts in dynamic query evaluation, it is natural to ask to what extent CER can take advantage of HCQ to find a robust class of queries that can be evaluated efficiently. In this paper, we seek to combine HCQ with sequence patterns to find a class of CER queries that gets the best of both worlds. To reach this goal, we propose a complex event automata model called Parallelized Complex Event Automata (PCEA) for evaluating CER queries with correlation (i.e., joins) over streams. This model allows us to express sequence patterns and compare values among tuples, but it also allows us to express conjunctions by incorporating a novel form of non-determinism that we call parallelization. We show that for every HCQ (under bag semantics), we can construct an equivalent PCEA. Further, we show that HCQ is the largest class of acyclic CQ that this automata model can define. Thus, PCEA stands as a sweet spot that precisely expresses HCQ (i.e., among acyclic CQ) and extends it with sequence patterns. Finally, we show that PCEA also inherits the good algorithmic properties of HCQ by presenting a streaming evaluation algorithm under sliding windows with logarithmic update time and output-linear delay for the class of PCEA with equality predicates.
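The abstract relies on the notion of a hierarchical CQ without restating it. As a reminder, the standard condition for a CQ without projection is that, for every pair of variables, the sets of atoms containing them are either nested or disjoint. The following is a minimal sketch of that check, not code from the paper; the query encoding (a list of relation-name and variable-tuple pairs) and the names `atoms_of`, `is_hierarchical`, `q_hier`, and `q_path` are illustrative assumptions.

```python
from itertools import combinations

# Assumed encoding: a CQ without projection is a list of atoms,
# each atom being (relation_name, tuple_of_variables).

def atoms_of(query):
    """Map each variable to the set of atom indices in which it occurs."""
    occurrences = {}
    for index, (_relation, variables) in enumerate(query):
        for var in variables:
            occurrences.setdefault(var, set()).add(index)
    return occurrences

def is_hierarchical(query):
    """Check that, for every pair of variables, their atom sets are
    nested (one contains the other) or disjoint."""
    occurrences = atoms_of(query)
    for x, y in combinations(occurrences, 2):
        ax, ay = occurrences[x], occurrences[y]
        if not (ax <= ay or ay <= ax or ax.isdisjoint(ay)):
            return False
    return True

# Hypothetical example queries:
# R(x, y), S(x, z), T(x): x occurs everywhere, y and z are disjoint.
q_hier = [("R", ("x", "y")), ("S", ("x", "z")), ("T", ("x",))]
# R(x), S(x, y), T(y): acyclic, but x and y overlap only on S.
q_path = [("R", ("x",)), ("S", ("x", "y")), ("T", ("y",))]

print(is_hierarchical(q_hier))  # True
print(is_hierarchical(q_path))  # False
```

The second query is the usual example of an acyclic CQ that is not hierarchical, the kind of query that, according to the abstract, PCEA cannot define.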
