
Accelerating Machine Learning Queries with Linear Algebra Query Processing

Wenbo Sun, Asterios Katsifodimos, Rihan Hai

TL;DR

The paper addresses the bottleneck of real-time ML predictions in end-to-end pipelines by unifying relational data processing and ML computations through Linear Algebra Queries (LAQ). It proposes an operator fusion framework that pushes linear algebraic ML operators into the dimension tables of a star schema, enabled by GPU-friendly matrix representations of relational operators. Through complexity analyses and extensive experiments on the Star Schema Benchmark and synthetic data, it demonstrates end-to-end speedups of up to $317\times$ and characterizes when fusion yields the most benefit via the ratio $\frac{k}{l}$. The work highlights practical performance trade-offs, including pre-fusion costs and data update patterns, and suggests directions for cost-driven fusion strategies and additional fusion rules for deeper ML models.

Abstract

The rapid growth of large-scale machine learning (ML) models has led numerous commercial companies to use ML models to generate predictive results that support business decision-making. Data processing and model prediction, the two primary components of traditional predictive pipelines, often operate in separate execution environments, leading to redundant engineering and computation. Additionally, the diverging mathematical foundations of data processing and machine learning hinder cross-optimization across these two components, so potential opportunities to expedite predictive pipelines are overlooked. In this paper, we propose an operator fusion method based on GPU-accelerated linear algebraic evaluation of relational queries. Our method leverages linear algebra computation properties to merge operators in machine learning predictions and data processing, accelerating predictive pipelines by up to 317x. We perform a complexity analysis to deliver quantitative insights into the advantages of operator fusion, considering various data and model dimensions. Furthermore, we extensively evaluate matrix multiplication query processing using the widely used Star Schema Benchmark. Through comprehensive evaluations, we demonstrate the effectiveness and potential of our approach in improving the efficiency of data processing and machine learning workloads on modern hardware.
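
The fusion idea described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: it runs on the CPU, uses dense matrices, and all names and sizes (`J`, `D`, `W`, `n_fact`, `n_dim`, `n_features`, `n_outputs`) are hypothetical choices made for illustration. It assumes an equi-join between a fact table and one dimension table can be written as a one-hot matrix `J`, and that the ML operator is a single linear layer `W`. Under those assumptions, associativity of matrix multiplication lets the model be applied to the small dimension table before the join.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (hypothetical): fact rows, dimension rows, feature columns, model outputs.
n_fact, n_dim = 8, 3
n_features, n_outputs = 4, 2

# Foreign-key column of the fact table: each fact row references one dimension row.
fk = rng.integers(0, n_dim, size=n_fact)

# Equi-join expressed as a matrix: row i is one-hot at column fk[i].
J = np.zeros((n_fact, n_dim))
J[np.arange(n_fact), fk] = 1.0

D = rng.normal(size=(n_dim, n_features))      # dimension-table feature columns
W = rng.normal(size=(n_features, n_outputs))  # weights of an assumed linear model

# Unfused pipeline: materialize the join result, then apply the model to every fact row.
unfused = (J @ D) @ W

# Fused pipeline: apply the model to the dimension table once, then join.
fused = J @ (D @ W)

# Both orders give the same predictions because matrix multiplication is associative.
assert np.allclose(unfused, fused)

# Sanity check: the join-as-matmul matches a plain lookup by foreign key.
assert np.allclose(J @ D, D[fk])
```

In the fused order the model multiplies only `n_dim` rows instead of `n_fact` rows, which is where a saving can come from when the dimension table is much smaller than the fact table; the actual benefit depends on the data and model dimensions analyzed in the paper.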

Paper Structure

This paper contains 29 sections, 13 equations, 21 figures, 5 tables, 1 algorithm.

Figures (21)

  • Figure 1: Speedups of our operator fusion method in four experimental predictive pipelines. The baseline is cuDF without operator fusion. The maximum attainable speedup is 317.77x.
  • Figure 2: An example of projection as matrix multiplication.
  • Figure 3: An illustration for evaluating equi-join with matrix multiplication.
  • Figure 4: An illustration for evaluating group-by-sum with matrix multiplication.
  • Figure 5: Prediction with a decision tree in a linear algebraic representation, adapted from Hummingbird.
  • ...and 16 more figures