F-IVM: Analytics over Relational Databases under Updates
Ahmet Kara, Milos Nikolic, Dan Olteanu, Haozhe Zhang
TL;DR
F-IVM is a unified framework for maintaining a wide range of analytics over evolving relational data. It combines higher-order incremental view maintenance, factorized computation, and a ring abstraction. It introduces a data and query model over rings, a view-tree architecture driven by variable orders, and a factorization strategy that enables efficient incremental maintenance for both classic and ML-centric analytics, including covariance-based linear regression, Chow-Liu trees, and matrix chain multiplication. Implemented on top of DBToaster, F-IVM can outperform classical first-order and fully recursive higher-order IVM by orders of magnitude in runtime while using less memory, and it supports these tasks through a single ring abstraction over factorized representations. These results point to practical use in real-time in-database analytics and provide a foundation for extending the approach to broader ML and graphical-model workloads.
Abstract
This article describes F-IVM, a unified approach for maintaining analytics over changing relational data. We exemplify its versatility in four disciplines: processing queries with group-by aggregates and joins; learning linear regression models using the covariance matrix of the input features; building Chow-Liu trees using the pairwise mutual information of the input features; and matrix chain multiplication. F-IVM has three main ingredients: higher-order incremental view maintenance, factorized computation, and ring abstraction. F-IVM reduces the maintenance of a task to that of a hierarchy of simple views. Such views are functions mapping keys, which are tuples of input values, to payloads, which are elements of a ring. F-IVM also supports efficient factorized computation over keys, payloads, and updates. Finally, F-IVM treats seemingly disparate tasks uniformly. In the key space, all tasks require joins and variable marginalization. In the payload space, tasks differ in the definition of the sum and product ring operations. We implemented F-IVM on top of DBToaster and show that it can outperform classical first-order and fully recursive higher-order incremental view maintenance by orders of magnitude while using less memory.
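To make the key/payload view model concrete, the following is a minimal Python sketch, not the DBToaster-based implementation. It assumes two hypothetical relations R(A, B) and S(B, C), the ring of integers as payload ring (so payloads are tuple multiplicities and the query is a COUNT grouped by A), and illustrative helper names (`Ring`, `View`, `union`, `join`, `marginalize`) that do not come from the article. It maintains a two-level hierarchy of views and checks that applying a delta to R gives the same result as recomputing from scratch.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Ring:
    """Payload ring: only the sum and product change from task to task."""
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]

# Assumption: integers with + and *, so a payload is a (possibly negative) count.
COUNT = Ring(0, 1, lambda a, b: a + b, lambda a, b: a * b)

class View:
    """A view maps keys (tuples over `schema`) to ring payloads."""
    def __init__(self, schema, data=None):
        self.schema = tuple(schema)
        self.data = dict(data or {})

def union(ring, v, w):
    """Key-wise sum of two views with the same schema; zero payloads are dropped."""
    out = dict(v.data)
    for key, p in w.data.items():
        s = ring.add(out.get(key, ring.zero), p)
        if s == ring.zero:
            out.pop(key, None)
        else:
            out[key] = s
    return View(v.schema, out)

def join(ring, v, w):
    """Natural join on shared variables; payloads are multiplied."""
    shared = [x for x in v.schema if x in w.schema]
    extra = [x for x in w.schema if x not in v.schema]
    out = {}
    for kv, pv in v.data.items():
        rv = dict(zip(v.schema, kv))
        for kw, pw in w.data.items():
            rw = dict(zip(w.schema, kw))
            if all(rv[x] == rw[x] for x in shared):
                key = kv + tuple(rw[x] for x in extra)
                out[key] = ring.add(out.get(key, ring.zero), ring.mul(pv, pw))
    return View(v.schema + tuple(extra), out)

def marginalize(ring, v, var):
    """Sum out one variable from the keys, adding up the payloads."""
    i = v.schema.index(var)
    out = {}
    for key, p in v.data.items():
        rest = key[:i] + key[i + 1:]
        out[rest] = ring.add(out.get(rest, ring.zero), p)
    return View(v.schema[:i] + v.schema[i + 1:],
                {k: p for k, p in out.items() if p != ring.zero})

def query(r, s):
    """Q[A] = sum over B, C of R[A,B] * S[B,C], via a small hierarchy of views."""
    v_s = marginalize(COUNT, s, "C")                      # V_S[B]
    return marginalize(COUNT, join(COUNT, r, v_s), "B")   # Q[A]

R = View(("A", "B"), {("a1", "b1"): 1, ("a2", "b1"): 1, ("a2", "b2"): 1})
S = View(("B", "C"), {("b1", "c1"): 1, ("b1", "c2"): 1, ("b2", "c1"): 1})

V_S = marginalize(COUNT, S, "C")   # materialized once, reused for updates to R
Q = query(R, S)

# Update to R: insert (a1, b2), delete (a2, b1); a delete is a negative payload.
dR = View(("A", "B"), {("a1", "b2"): 1, ("a2", "b1"): -1})

# The delta of Q only needs dR and the materialized V_S, not S itself.
dQ = marginalize(COUNT, join(COUNT, dR, V_S), "B")
Q_incremental = union(COUNT, Q, dQ)

# Reference result: recompute from scratch over the updated R.
Q_recomputed = query(union(COUNT, R, dR), S)
```

In this sketch, deletes work because the payload ring has additive inverses. Other tasks would keep `join` and `marginalize` unchanged and swap in a different `Ring` instance, which is the uniformity the abstract describes between the key space and the payload space.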
