Sublinear Time Algorithm for Online Weighted Bipartite Matching

Hang Hu; Zhao Song; Runzhou Tao; Zhaozhuo Xu; Junze Yin; Danyang Zhuo

TL;DR

This paper tackles online weighted bipartite matching where edge weights are given by inner products of $d$-dimensional feature vectors, a setting common in recommendation and ranking systems with large item sets. It develops randomized data structures that approximate weights, enabling sublinear-time weight computations per arriving online vertex while preserving a $\frac{1}{2}$-approximation ratio via a robust greedy framework. The authors introduce a dynamic inner-product estimation structure and, for the distance and inner-product variants, derive per-arrival time bounds of $\widetilde{O}(\epsilon^{-2}(n+d)\log(n/\delta))$ and $\widetilde{O}(\epsilon^{-2}D^2(n+d)\log(n/\delta))$ respectively, with high-probability guarantees; they further improve the inner-product case using a Max-IP-based approach to achieve sublinear updates with a tunable exponent. The work demonstrates that sublinear-time online matching is feasible in a meaningful subset of the problem space, offering practical latency improvements for large-scale systems while maintaining rigorous competitive guarantees.

Abstract

Online bipartite matching is a fundamental problem in online algorithms. The goal is to match two sets of vertices so as to maximize the sum of the edge weights, where the vertices of one set arrive in sequence, each together with its edge weights. In practical recommendation systems and search engines, the weights are given by the inner product between the deep representation of a user and the deep representation of an item. Standard online matching spends $nd$ time per arrival on a linear scan that computes the weights of all $n$ items (assuming each representation vector has length $d$), and then decides the matching based on those weights. In practice, however, $n$ can be very large, e.g., on online e-commerce platforms, so reducing the time needed to compute the weights is a problem of practical significance. In this work, we provide the theoretical foundation for computing the weights approximately. We show that, with our proposed randomized data structures, the weights can be computed in sublinear time while still preserving the competitive ratio of the matching algorithm.
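The linear-scan baseline described in the abstract can be sketched as follows. This is a minimal illustration rather than the paper's algorithm: it assumes a free-disposal greedy rule (each item keeps only the heaviest edge assigned to it, and an arrival goes to the item with the largest positive marginal gain), and the names `greedy_arrival` and `held` are hypothetical. Each arrival costs $nd$ multiplications, which is the cost the paper aims to reduce.

```python
from typing import List, Optional, Sequence


def inner_product(x: Sequence[float], y: Sequence[float]) -> float:
    """Exact edge weight w(x, y) = <x, y>; costs d multiplications."""
    return sum(a * b for a, b in zip(x, y))


def greedy_arrival(
    x: Sequence[float],
    items: Sequence[Sequence[float]],
    held: List[float],
) -> Optional[int]:
    """Process one arriving vertex with feature vector x.

    items[j] is the vector of offline item j; held[j] is the weight item j
    currently keeps (0.0 if unmatched). Assumption: free disposal, so the
    gain of assigning x to j is w(x, j) - held[j].
    Returns the chosen item index, or None if no gain is positive.
    """
    best_j, best_gain = None, 0.0
    for j, y in enumerate(items):  # linear scan over all n items
        gain = inner_product(x, y) - held[j]
        if gain > best_gain:
            best_j, best_gain = j, gain
    if best_j is not None:
        held[best_j] += best_gain  # item now keeps the heavier edge
    return best_j


items = [[1.0, 0.0], [0.0, 1.0]]
held = [0.0, 0.0]
choices = [greedy_arrival(x, items, held)
           for x in ([2.0, 1.0], [3.0, 0.5], [1.0, 1.0])]
```

In this toy run the second arrival displaces the first on item 0, because its marginal gain there (1.0) exceeds its gain on the still-unmatched item 1 (0.5).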

Paper Structure (36 sections, 29 theorems, 41 equations, 6 algorithms)

Introduction
Roadmap
Our Results
Related Work
Online Weighted Bipartite Matching.
Data Structures for Machine Learning.
Preliminaries
Notations.
Locality Sensitive Hashing
Other Useful Concepts
Online Weight Bipartite Matching With Approximate Weight
Our Online Matching Problem
Greedy Algorithm is 1/2-approximation
Approximate Weight Function Implies Slightly Worse Approximate Ratio
Dynamic Data Structure
...and 21 more sections

Key Result

Theorem 1.2

Let $\epsilon \in (0,1)$ and $\delta \in (0,1)$. Then there is an online bipartite matching algorithm (with $w(x,y)=\| x- y\|_2$) that takes $\widetilde{O}(\epsilon^{-2}(n+d)\log(n/\delta))$ time for each arriving vertex, and its approximation guarantee holds with probability at least $1-\delta$, where $n$ is the number of online points.
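To give a feel for where an $\epsilon^{-2}\log(n/\delta)$ factor can come from, here is a hedged sketch of approximate distance weights using a Johnson-Lindenstrauss style Gaussian random projection. This page does not say how the paper's data structure is built, so treat the technique, the class name `DistanceSketch`, and the sketch dimension `k` as assumptions for illustration only. With $k$ on the order of $\epsilon^{-2}\log(n/\delta)$, a query costs about $kd$ to project the arrival plus $nk$ to compare against stored sketches, i.e. roughly $k(n+d)$.

```python
import math
import random
from typing import List, Sequence


class DistanceSketch:
    """Estimates ||x - y||_2 from k-dimensional random projections.

    Assumption: a dense Gaussian projection scaled by 1/sqrt(k); this is
    a generic JL sketch, not necessarily the paper's construction.
    """

    def __init__(self, d: int, k: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        scale = 1.0 / math.sqrt(k)
        # k x d projection matrix with i.i.d. N(0, 1/k) entries.
        self.proj = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)]
                     for _ in range(k)]
        self.sketches: List[List[float]] = []

    def _project(self, x: Sequence[float]) -> List[float]:
        return [sum(p * a for p, a in zip(row, x)) for row in self.proj]

    def insert(self, y: Sequence[float]) -> int:
        """Store the sketch of y and return its index."""
        self.sketches.append(self._project(y))
        return len(self.sketches) - 1

    def estimate_all(self, x: Sequence[float]) -> List[float]:
        """Approximate distances from x to every stored point."""
        sx = self._project(x)  # about k*d work
        return [math.dist(sx, sy) for sy in self.sketches]  # about n*k work


d, k = 20, 400
rng = random.Random(1)
points = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(5)]
query = [rng.uniform(-1, 1) for _ in range(d)]

sketch = DistanceSketch(d, k, seed=0)
for p in points:
    sketch.insert(p)
estimates = sketch.estimate_all(query)
exact = [math.dist(query, p) for p in points]
```

The estimates concentrate around the exact distances as `k` grows; the projection only pays off once `k` is much smaller than `d`, which is not the case in this toy setup chosen for readability.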

Theorems & Definitions (56)

Definition 1.1: Informal version of Definition \ref{def:main_problem}
Theorem 1.2: Main result, informal version of Theorem \ref{thm:distance_weight}
Theorem 1.3: Main result, informal version of Theorem \ref{thm:inner_product_weight}
Theorem 1.4: Main result, informal version of Theorem \ref{thm:inner_product_weight_v2}
Definition 2.1: Locality sensitive hashing [im98]
Definition 2.2: Approximate Near Neighbor
Definition 2.3: Approximate Max-IP
Theorem 2.4: Andoni and Razenshteyn [ar15]
Theorem 2.5: Theorem 8.2, page 19 of [ssx21]
Definition 2.6: Submodular function [s03]
...and 46 more
