Deterministic $(1+\varepsilon)$-Approximate Maximum Matching with $\mathsf{poly}(1/\varepsilon)$ Passes in the Semi-Streaming Model and Beyond

Manuela Fischer, Slobodan Mitrović, Jara Uitto

TL;DR

This work presents a deterministic $(1+\varepsilon)$-approximation algorithm for maximum matching in the semi-streaming model that uses $\mathsf{poly}(1/\varepsilon)$ passes and $n\cdot\mathsf{poly}(1/\varepsilon)$ memory. It introduces a phase-based, Hopcroft-Karp-inspired approach augmented with structures for free nodes, on-hold mechanisms, and jumping to extend active paths, while handling odd cycles without blossom contraction. The paper also develops a general framework that simulates the streaming approach in CONGEST and MPC, achieving polynomial dependence on $1/\varepsilon$ where prior results had exponential dependence.

Abstract

We present a deterministic $(1+\varepsilon)$-approximate maximum matching algorithm that uses $\mathsf{poly}(1/\varepsilon)$ passes in the semi-streaming model, solving the long-standing open problem of breaking the exponential barrier in the dependence on $1/\varepsilon$. Our algorithm exponentially improves on the well-known randomized $(1/\varepsilon)^{O(1/\varepsilon)}$-pass algorithm from the seminal work by McGregor [APPROX05] and on the recent deterministic algorithm by Tirodkar with the same pass complexity [FSTTCS18]. Up to polynomial factors in $1/\varepsilon$, our work matches the state-of-the-art deterministic $(\log n / \log \log n) \cdot (1/\varepsilon)$-pass algorithm by Ahn and Guha [TOPC18], which is allowed a dependence on the number of nodes $n$. Our result also makes progress on Open Problem 60 at sublinear.info. Moreover, we design a general framework that simulates our approach for the streaming setting in other models of computation. This framework requires access to an algorithm computing an $O(1)$-approximate maximum matching and an algorithm for processing disjoint $\mathsf{poly}(1/\varepsilon)$-size connected components. Instantiating our framework in $\mathsf{CONGEST}$ yields a $\mathsf{poly}(\log{n}, 1/\varepsilon)$-round algorithm for computing a $(1+\varepsilon)$-approximate maximum matching. In terms of the dependence on $1/\varepsilon$, this result improves exponentially on the state-of-the-art result by Lotker, Patt-Shamir, and Pettie [LPSP15]. Our framework leads to the same quality of improvement in the context of the Massively Parallel Computation model as well.
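
To make the streaming setting concrete, the sketch below shows the textbook one-pass greedy algorithm, which stores only the matched endpoints and returns a maximal matching, hence a 2-approximation. It is not the paper's algorithm; it only illustrates the kind of $O(1)$-approximate matching subroutine the framework assumes is available. The function name `greedy_matching_one_pass` and the edge-list representation of the stream are assumptions made for this example.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[int, int]


def greedy_matching_one_pass(edge_stream: Iterable[Edge]) -> List[Edge]:
    """Read the stream once and keep an edge iff both endpoints are still free.

    Memory is proportional to the number of nodes, not the number of edges.
    The result is a maximal matching, so it is at least half the size of a
    maximum matching.
    """
    matched_nodes: Set[int] = set()
    matching: List[Edge] = []
    for u, v in edge_stream:
        # Skip self-loops and edges touching an already matched node.
        if u != v and u not in matched_nodes and v not in matched_nodes:
            matching.append((u, v))
            matched_nodes.update((u, v))
    return matching


# Path 0-1-2-3 where the middle edge arrives first (hypothetical input).
stream = [(1, 2), (0, 1), (2, 3)]
print(greedy_matching_one_pass(stream))  # [(1, 2)]
```

On this input the greedy matching has one edge while the maximum matching $\{(0,1),(2,3)\}$ has two, which is exactly the factor-2 gap that additional passes searching for augmenting paths are meant to close.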

Paper Structure

This paper contains 64 sections, 29 theorems, 10 equations, 4 figures, 1 table, 7 algorithms.

Key Result

Theorem 1.1

Given a graph on $n$ vertices, there is a deterministic $(1+\varepsilon)$-approximation algorithm for maximum matching that runs in $\mathop{\mathrm{poly}}\nolimits(1/\varepsilon)$ passes in the semi-streaming model. Furthermore, the algorithm requires $n \cdot \mathop{\mathrm{poly}}\nolimits(1/\varepsilon)$ [...]

Figures (4)

  • Figure 1: Nodes $\alpha$, $\beta$, and $\gamma$ are free. The black single-segments are unmatched and black (full) double-segments are matched edges. The path $P'$ corresponding to a DFS branch of $\gamma$ is shown by the red solid spline. Since the edge $a_5$ is part of the path, the current DFS branch of $\gamma$ cannot be extended up to the free node $\beta$ along the dashed blue line. Furthermore, the path from $\gamma$ to the edge $a_3$ can potentially "block" a longer DFS search path of $\alpha$ illustrated with a solid blue line. However, the edges along the DFS searches of $\alpha$ and $\gamma$ can be combined to find an augmenting path between $\alpha$ and $\gamma$.
  • Figure 2: Assume that we first perform a DFS-style search over the red (dashed) path from $\alpha$ to $a_3$. Moreover, assume that the algorithm maintains the (shortest) path length of the DFS as labels on edges, but not on arcs. In that case, this red path sets labels $\ell(a_1) = 1$, $\ell(a_2) = 2$ and $\ell(a_3) = 3$. Eventually, the search backtracks to $a_1$ and continues over $a_3$, setting the label to $\ell(a_3) = 2$. However, this DFS branch cannot continue to $a_4$, even though the path exists, due to the distance label $\ell(a_2) < \ell(a_3) + 1$. By considering labels on arcs, we allow $\alpha$ to extend its DFS search over $(\alpha, a_1, a_3, a_2, a_4)$.
  • Figure 3: In this example, $\alpha$ is a free node, black (full) single-segments are unmatched and black (full) double-segments are matched edges. Assume that the algorithm first explores the red (dashed) path from $\alpha$ to $a_4$. This red path sets labels $\ell(\overleftarrow{a_5}) = 1$ and $\ell(a_4) = 2$. After this exploration, in the next two passes $\alpha$ backtracks along the red path and sets the active path to be $(\alpha)$. Then, $\alpha$ continues extending its active path (over three Pass-Bundles) along $a_1$, $a_2$ and $a_3$, setting labels $\ell(a_1) = 1$, $\ell(a_2) = 2$ and $\ell(a_3) = 3$, illustrated with the blue solid line. However, notice that $\ell(a_4) < 4$ and hence $\alpha$ does not extend the active path to become $(\alpha, a_1, a_2, a_3, a_4)$. On the other hand, $\ell(a_5) = \infty$ and our algorithm should enable $\alpha$ to reach $a_5$. This is achieved by enabling $\alpha$ to "jump" over $a_4$ and let the active path become $(\alpha, a_1, a_2, a_3, a_4, a_5)$, illustrated with the blue dashed line. We also set $\ell(a_5) \coloneqq 5$. (In this example, the arc $a_4$ plays the role of $b_1$ in the description of Extend-Active-Paths.) Subsequently, in the next Pass-Bundle, $\alpha$ will extend its active path to $(\alpha, a_1, a_2, a_3, a_4, a_5, a_6)$.
  • Figure 4: Illustrations for the inductive step of the lemma on non-active cycles. The upper picture illustrates the case where the alternating path $P(a_{i + 1}) = (b_1, \ldots, b_h)$ intersects the suffix $(c_1, \ldots, c_{h'})$ of the alternating path $P(\overleftarrow{a_{i + 1}})$ at some $\overleftarrow{b_j} = c_{j'}$. In this case, we can complete the prefix $(b_1, \ldots, b_{j})$ into a path to $\overleftarrow{a_1}$ using the prefix of $P(\overleftarrow{a_{i + 1}})$. In the lower picture, we consider the case that $b_j = c_{j'}$. In this case, we can complete the prefix $(b_1, \ldots, b_{j})$ into an alternating path to $\overleftarrow{a_1}$ using the suffix of $P(\overleftarrow{a_{i + 1}})$ and the path $\overleftarrow{a_1, \ldots, a_{i + 1}}$. In the proof, we show that $(b_1, \ldots, b_{j - 1}) \circ (c_{j'}, \ldots, c_{h'})$ cannot intersect the path $(a_1, \ldots, a_{i + 1})$.
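
The captions above revolve around alternating and augmenting paths between free nodes. As a minimal sketch of that standard notion (not the paper's DFS, labeling, or jumping machinery), the code below checks whether a node sequence is an augmenting path for a matching and, if so, flips it via a symmetric difference, which grows the matching by one edge. The helper names `edge`, `is_augmenting`, and `augment` are hypothetical, and the sketch assumes every consecutive pair in the path is an edge of the graph.

```python
from typing import FrozenSet, List, Set

Matching = Set[FrozenSet[int]]


def edge(u: int, v: int) -> FrozenSet[int]:
    """Undirected edge as an unordered pair."""
    return frozenset((u, v))


def is_augmenting(path: List[int], matching: Matching) -> bool:
    """True iff `path` starts and ends at free nodes and alternates
    unmatched, matched, ..., unmatched edges without repeating a node.
    Assumes consecutive nodes of `path` are adjacent in the graph."""
    if len(path) < 2 or len(path) % 2 != 0 or len(set(path)) != len(path):
        return False
    matched_nodes = {x for e in matching for x in e}
    if path[0] in matched_nodes or path[-1] in matched_nodes:
        return False
    for i in range(len(path) - 1):
        should_be_matched = i % 2 == 1  # edges at odd positions are matched
        if (edge(path[i], path[i + 1]) in matching) != should_be_matched:
            return False
    return True


def augment(path: List[int], matching: Matching) -> Matching:
    """Flip matched and unmatched edges along an augmenting path."""
    if not is_augmenting(path, matching):
        raise ValueError("path is not augmenting for this matching")
    path_edges = {edge(path[i], path[i + 1]) for i in range(len(path) - 1)}
    return matching ^ path_edges  # symmetric difference


# Path 0-1-2-3 with only the middle edge matched (hypothetical input).
current = {edge(1, 2)}
improved = augment([0, 1, 2, 3], current)
print(sorted(sorted(e) for e in improved))  # [[0, 1], [2, 3]]
```

Representing edges as `frozenset` pairs keeps the flip a single set operation and makes the orientation of an edge irrelevant; the paper's arc-based labels, by contrast, depend on direction and are not modeled here.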

Theorems & Definitions (61)

  • Theorem 1.1
  • Theorem 1.2: Restatement of the framework theorem
  • Definition 2.1: An Unmatched Edge and a Free Node
  • Definition 2.2: Alternating Path, Alternating Length, Alternating Distance
  • Definition 2.3: Concatenation of Alternating Paths
  • Definition 2.4: Path Reverse
  • Definition 2.5: The Label of an Arc
  • Definition 2.6: Structure of a Free Node
  • Definition 2.8: Active and Inactive Arcs and Vertices
  • Definition 2.9: Removing-directions operator
  • ...and 51 more