Optimal Approximate Matrix Multiplication over Sliding Windows

Ziqi Yao, Mingsong Chen, Cheng Chen

TL;DR

This work addresses approximate matrix multiplication (AMM) under sliding-window constraints, developing the deterministic DS-COD algorithm to achieve an optimal space–error tradeoff for both sequence-based and time-based windows. It introduces a hierarchical, snapshot-based variant (hDS-COD) and a space-efficient adaptive variant (aDS-COD) that dynamically adjusts thresholds, together with a rigorous lower-bound analysis establishing optimality. The methods are backed by theoretical guarantees on error and space, plus experiments on synthetic and real data showing efficiency and accuracy gains over prior sliding-window AMM approaches. The results are relevant to time-sensitive matrix computations in streaming tasks, PCA-like workflows, and online ML pipelines where only recent information should be kept.

Abstract

We explore the problem of approximate matrix multiplication (AMM) within the sliding window model, where algorithms utilize limited space to perform large-scale matrix multiplication in a streaming manner. This model has garnered increasing attention in the fields of machine learning and data mining due to its ability to handle time sensitivity and reduce the impact of outdated data. However, despite recent advancements, determining the optimal space bound for this problem remains an open question. In this paper, we introduce the DS-COD algorithm for AMM over sliding windows. This novel and deterministic algorithm achieves optimal performance regarding the space-error tradeoff. We provide theoretical error bounds and the complexity analysis for the proposed algorithm, and establish the corresponding space lower bound for the AMM sliding window problem. Additionally, we present an adaptive version of DS-COD, termed aDS-COD, which improves computational efficiency and demonstrates superior empirical performance. Extensive experiments conducted on both synthetic and real-world datasets validate our theoretical findings and highlight the practical effectiveness of our methods.

Paper Structure

This paper contains 28 sections, 6 theorems, 14 equations, 2 figures, 2 tables, 6 algorithms.

Key Result

Lemma 1

Let $[{\bf Q}_x,{\bf R}_x]=\operatorname{QR}({\bf A})$, $[{\bf Q}_y,{\bf R}_y]=\operatorname{QR}({\bf B})$, $[{\bf U}, \mathbf{\Sigma}, {\bf V}] = \operatorname{SVD}({\bf R}_x {\bf R}_y^\top)$, ${\bf C} = {\bf Q}_x {\bf U} \sqrt{\mathbf{\Sigma}}$ and ${\bf D} = {\bf Q}_y {\bf V} \sqrt{\mathbf{\Sigma}}$ …
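
The construction named in Lemma 1 can be checked numerically. The following is a minimal sketch, assuming NumPy, tall input matrices (at least as many rows as columns), and a hypothetical helper name `balanced_factors`; none of these come from the paper. It illustrates only what follows directly from the definitions: since ${\bf A} = {\bf Q}_x {\bf R}_x$ and ${\bf B} = {\bf Q}_y {\bf R}_y$, the product ${\bf C}{\bf D}^\top$ equals ${\bf A}{\bf B}^\top$, and ${\bf C}^\top{\bf C} = {\bf D}^\top{\bf D} = \mathbf{\Sigma}$. It is not a reproduction of the lemma's full statement.

```python
import numpy as np


def balanced_factors(A, B):
    """Build C and D from A and B via QR + SVD, as in the definitions above.

    Assumes A is (m_x, l) and B is (m_y, l) with m_x >= l and m_y >= l.
    The function name and signature are illustrative, not from the paper.
    """
    Qx, Rx = np.linalg.qr(A)  # reduced QR: Qx is (m_x, l), Rx is (l, l)
    Qy, Ry = np.linalg.qr(B)  # reduced QR: Qy is (m_y, l), Ry is (l, l)
    U, s, Vt = np.linalg.svd(Rx @ Ry.T, full_matrices=False)
    root = np.sqrt(s)         # diagonal of sqrt(Sigma)
    C = (Qx @ U) * root       # scales column j by sqrt(sigma_j)
    D = (Qy @ Vt.T) * root
    return C, D, s


rng = np.random.default_rng(0)
A = rng.standard_normal((50, 8))
B = rng.standard_normal((40, 8))
C, D, s = balanced_factors(A, B)

# The factorization preserves the product and balances the two factors.
print(np.allclose(A @ B.T, C @ D.T))
print(np.allclose(C.T @ C, np.diag(s)), np.allclose(D.T @ D, np.diag(s)))
```

Splitting $\mathbf{\Sigma}$ evenly between the two factors keeps ${\bf C}$ and ${\bf D}$ on the same scale, which is convenient when both sketches are later shrunk by the same threshold.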

Figures (2)

  • Figure 1: The illustration of Dump Snapshots Co-occurring Directions.
  • Figure 2: The plot of correlation error against final sketch size.

Theorems & Definitions (8)

  • Definition 3.1 (Mroueh et al., 2017)
  • Definition 3.2
  • Lemma 1
  • Theorem 1
  • Corollary 4.1
  • Lemma 2
  • Theorem 2
  • Theorem 3