Sublinear Time Algorithm for Online Weighted Bipartite Matching
Hang Hu, Zhao Song, Runzhou Tao, Zhaozhuo Xu, Junze Yin, Danyang Zhuo
TL;DR
This paper tackles online weighted bipartite matching where edge weights are given by inner products of $d$-dimensional feature vectors, a setting common in recommendation and ranking systems with large item sets. It develops randomized data structures that approximate the weights, so that the weight computation for each arriving online vertex takes time sublinear in the $nd$ cost of a linear scan, while a robust greedy framework preserves a $\frac{1}{2}$-approximation ratio. The authors introduce a dynamic inner-product estimation structure and derive per-arrival time bounds of $\widetilde{O}(\epsilon^{-2}(n+d)\log(n/\delta))$ for the distance variant and $\widetilde{O}(\epsilon^{-2}D^2(n+d)\log(n/\delta))$ for the inner-product variant, both with high-probability guarantees. They further improve the inner-product case with a Max-IP-based approach that achieves sublinear updates with a tunable exponent. The work shows that sublinear-time online matching is feasible for a meaningful subset of the problem space, offering practical latency improvements for large-scale systems while maintaining rigorous competitive guarantees.
Abstract
Online bipartite matching is a fundamental problem in online algorithms. The goal is to match two sets of vertices so as to maximize the sum of the edge weights, where the vertices of one set, together with their incident edge weights, arrive one at a time. In practical recommendation systems and search engines, the weights are determined by the inner product between the deep representation of a user and the deep representation of an item. A standard online matching algorithm spends $nd$ time on a linear scan of all $n$ items to compute the weights (assuming each representation vector has length $d$) and then decides the matching based on those weights. In practice, however, $n$ can be very large, e.g., on online e-commerce platforms. Improving the time needed to compute the weights is therefore a problem of practical significance. In this work, we provide a theoretical foundation for computing the weights approximately. We show that, with our proposed randomized data structures, the weights can be computed in sublinear time while still preserving the competitive ratio of the matching algorithm.
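
To make the cost comparison concrete, the sketch below contrasts an exact linear scan, which costs $nd$ per arrival, with approximate weights computed from a random projection to $k$ dimensions, which costs roughly $k(n+d)$ per arrival after a one-time preprocessing pass. It is an illustration only, under several assumptions: the Gaussian projection is a standard Johnson-Lindenstrauss-style technique swapped in for the paper's data structures, the greedy rule (assign each arrival to the best still-unmatched item) is a simplification that carries no competitive-ratio guarantee here, and all names (`make_sketch`, `sketched_weights`, `greedy_match`, and so on) are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def make_sketch(k, d, rng):
    """k x d Gaussian matrix scaled by 1/sqrt(k), so E[<Su, Sv>] = <u, v>.

    Assumption: a generic random projection, not the paper's data structure.
    """
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(S, v):
    """Apply the sketch matrix S to the vector v."""
    return [dot(row, v) for row in S]


def exact_weights(items):
    """Weight oracle using a full linear scan: n*d work per arrival."""
    return lambda q: [dot(x, q) for x in items]


def sketched_weights(items, S):
    """Weight oracle using sketched vectors: about k*(n + d) work per arrival."""
    sketched_items = [project(S, x) for x in items]  # one-time preprocessing

    def weight_fn(q):
        sq = project(S, q)  # k*d work
        return [dot(sx, sq) for sx in sketched_items]  # n*k work

    return weight_fn


def greedy_match(n, arrivals, weight_fn):
    """Assign each arrival to the highest-weight item that is still unmatched."""
    free = set(range(n))
    matching = []
    for q in arrivals:
        if not free:
            break
        w = weight_fn(q)
        best = max(free, key=lambda i: w[i])
        free.remove(best)
        matching.append(best)
    return matching


def total_weight(items, arrivals, matching):
    """True (exact) total weight of a matching, whatever oracle produced it."""
    return sum(dot(items[i], q) for i, q in zip(matching, arrivals))


rng = random.Random(0)
n, d, k = 50, 64, 16
items = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
arrivals = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(10)]

exact = greedy_match(n, arrivals, exact_weights(items))
approx = greedy_match(n, arrivals, sketched_weights(items, make_sketch(k, d, rng)))
print(total_weight(items, arrivals, exact), total_weight(items, arrivals, approx))
```

Both matchings are scored with the exact inner products, so the two printed totals show how much matching quality the approximation gives up on this random instance. Larger `k` tightens the weight estimates at the price of more work per arrival, which mirrors the role of $\epsilon^{-2}$ in the bounds quoted in the TL;DR.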
