When Can We Solve the Weighted Low Rank Approximation Problem in Truly Subquadratic Time?

Chenyang Li, Yingyu Liang, Zhenmei Shi, Zhao Song

TL;DR

This work studies the weighted low-rank approximation problem for dense inputs and identifies a regime in which it can be solved in almost linear time. The authors combine sketching, semi-algebraic geometry, and a structured low-rank factorization that exploits a small number of distinct rows and columns in the weight matrix $W$ and in $W \circ A$. They establish lower and upper bounds on the optimal cost and develop a decision-problem framework that finds a rank-$k$ factorization in $n^{1+o(1)}$ time, with high probability, under the conditions $k^2 r = O(\log n / \log \log n)$ and $p = n^{o(1)}$. The result gives truly subquadratic time for a meaningful dense regime and a way to recover $U$ and $V$ efficiently, with implications for attention mechanisms and scalable matrix factorization. The paper also outlines avenues for extending these ideas.

Abstract

The weighted low-rank approximation problem is a fundamental problem in numerical linear algebra with many applications in machine learning. Given an $n \times n$ weight matrix $W$ and an $n \times n$ matrix $A$, the goal is to find two low-rank matrices $U, V \in \mathbb{R}^{n \times k}$ that minimize the cost $\| W \circ (U V^\top - A) \|_F^2$. Previous work has to pay $\Omega(n^2)$ time when the matrices $A$ and $W$ are dense, e.g., when they have $\Omega(n^2)$ non-zero entries. In this work, we show that there is a certain regime in which, even if $A$ and $W$ are dense, we can still hope to solve the weighted low-rank approximation problem in almost linear $n^{1+o(1)}$ time.
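To make the objective concrete, here is a minimal sketch that evaluates the cost $\| W \circ (U V^\top - A) \|_F^2$ directly from its definition. The function name `weighted_cost` and the toy matrices are illustrative assumptions, not taken from the paper, and this naive evaluation touches all $n^2$ entries, so it only defines the quantity being minimized and is not the paper's algorithm.

```python
def weighted_cost(W, A, U, V):
    """Return ||W ∘ (U V^T - A)||_F^2 for matrices given as lists of rows.

    W and A are n x n; U and V are n x k. Hypothetical helper for illustration.
    """
    n = len(A)
    k = len(U[0])
    total = 0.0
    for i in range(n):
        for j in range(n):
            # (U V^T)[i][j] is the inner product of row i of U and row j of V.
            approx = sum(U[i][t] * V[j][t] for t in range(k))
            # The Hadamard product with W rescales each entrywise residual.
            residual = W[i][j] * (approx - A[i][j])
            total += residual * residual
    return total


# Toy instance (illustrative values only): A has rank 1.
A = [[1.0, 2.0], [2.0, 4.0]]
W = [[1.0, 1.0], [1.0, 0.0]]      # a zero weight ignores entry (1, 1)
U = [[1.0], [2.0]]
V_good = [[1.0], [2.0]]           # U V_good^T reproduces A exactly
V_bad = [[1.0], [1.0]]            # U V_bad^T differs from A in column 1

cost_good = weighted_cost(W, A, U, V_good)
cost_bad = weighted_cost(W, A, U, V_bad)
```

With these values, `V_bad` is wrong in both entries of the second column, but the zero weight at position (1, 1) hides one of the two errors, which shows how $W$ changes which factorizations count as good.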

Paper Structure

This paper contains 20 sections, 10 theorems, 61 equations.

Key Result

Theorem 1.2

Let $A$ and $W$ denote two $n \times n$ matrices. Assume each entry of $A$ and $W$ needs $n^{\gamma}$ bits to represent (footnote: each entry in a matrix can be represented by $n^\gamma$ bits; in a real dataset, if we use the float32 format, then as long as $n^\gamma \ge 32$, our assumption holds), where $\gamma$ ... with probability at least $0.99$, where ...
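The float32 remark in the footnote can be checked with a line of arithmetic: $n^\gamma \ge 32$ is equivalent to $\gamma \ge \log_2 32 / \log_2 n = 5 / \log_2 n$. The sketch below computes that threshold; the helper name `min_gamma` and the sample value of $n$ are assumptions for illustration and do not appear in the paper.

```python
import math


def min_gamma(n, bits=32):
    """Smallest gamma with n**gamma >= bits, i.e. log2(bits) / log2(n).

    Hypothetical helper; bits=32 matches the float32 example in the footnote.
    """
    return math.log2(bits) / math.log2(n)


# Illustrative size: n = 2**20 (about one million rows and columns).
gamma_threshold = min_gamma(2 ** 20)
```

For $n = 2^{20}$ the threshold is $5/20 = 0.25$, and it shrinks as $n$ grows, since the required exponent is inversely proportional to $\log_2 n$.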

Theorems & Definitions (19)

  • Definition 1.1
  • Theorem 1.2: Main result
  • Definition 3.1: Distinct rows and columns
  • Lemma 3.3: Cramer's rule
  • Theorem 3.4: Decision Problem (r92a, r92b, bpr96)
  • Theorem 3.5: jpt13
  • Definition 3.6
  • Lemma 3.7: folklore
  • proof
  • Lemma 3.8: Theorem 3.1 in rsw16
  • ...and 9 more