Bipartite Matching in Massive Graphs: A Tight Analysis of EDCS

Amir Azarmehr, Soheil Behnezhad, Mohammad Roghani

TL;DR

A new approach for analyzing the approximation ratio of EDCS is proposed. It pinpoints the precise approximation ratio of EDCS for any sparsity parameter $\beta$ and reveals that, contrary to what previous analyses suggested, one does not necessarily need to increase $\beta$ to improve the approximation.

Abstract

Maximum matching is one of the most fundamental combinatorial optimization problems with applications in various contexts such as balanced clustering, data mining, resource allocation, and online advertisement. In many of these applications, the input graph is massive. The sheer size of these inputs makes it impossible to store the whole graph in the memory of a single machine and process it there. Graph sparsification has been an extremely powerful tool to alleviate this problem. In this paper, we study a highly successful and versatile sparsifier for the matching problem: the *edge-degree constrained subgraph (EDCS)* introduced first by Bernstein and Stein [ICALP'15]. The EDCS has a parameter $β\geq 2$ which controls the density of the sparsifier. It has been shown through various proofs in the literature that by picking a subgraph with $O(nβ)$ edges, the EDCS includes a matching of size at least $2/3-O(1/β)$ times the maximum matching size. As such, by increasing $β$ the approximation ratio of EDCS gets closer and closer to $2/3$. In this paper, we propose a new approach for analyzing the approximation ratio of EDCS. Our analysis is *tight* for any value of $β$. Namely, we pinpoint the precise approximation ratio of EDCS for any sparsity parameter $β$. Our analysis reveals that one does not necessarily need to increase $β$ to improve approximation, as suggested by previous analysis. In particular, the best choice turns out to be $β= 6$, which achieves an approximation ratio of $.677$! This is arguably surprising as it is even better than $2/3 \sim .666$, the bound that was widely believed to be the limit for EDCS.

Paper Structure (10 sections, 4 theorems, 5 equations, 5 figures, 2 tables)

Key Result

Proposition 1.2

Any graph $G$ contains a $(\beta, \beta^-)$-EDCS for any integers $\beta > \beta^- \geq 1$, and one can be found greedily in polynomial time.
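
The greedy procedure in Proposition 1.2 can be illustrated with a short local-search sketch. It assumes the standard EDCS definition of Bernstein and Stein, which this page does not restate: every edge $(u,v)$ of $H$ has edge degree $\deg_H(u) + \deg_H(v) \leq \beta$, and every edge of $G \setminus H$ has edge degree at least $\beta^-$. The names `find_edcs` and `is_edcs` are hypothetical, and the loop below is one simple way to repair violated edges, not necessarily the authors' exact algorithm.

```python
from collections import defaultdict


def find_edcs(edges, beta, beta_minus):
    """Local search for a (beta, beta_minus)-EDCS H of the graph `edges`.

    Assumed definition: edges in H have edge degree <= beta,
    edges outside H have edge degree >= beta_minus.
    """
    assert beta > beta_minus >= 1
    edges = {tuple(sorted(e)) for e in edges}
    h = set()
    deg = defaultdict(int)  # degrees inside H
    changed = True
    while changed:  # repeat until a full pass finds no violated edge
        changed = False
        for u, v in edges:
            d = deg[u] + deg[v]
            if (u, v) in h and d > beta:
                # Overfull edge: drop it from H.
                h.remove((u, v))
                deg[u] -= 1
                deg[v] -= 1
                changed = True
            elif (u, v) not in h and d < beta_minus:
                # Underfull edge: add it to H.
                h.add((u, v))
                deg[u] += 1
                deg[v] += 1
                changed = True
    return h


def is_edcs(edges, h, beta, beta_minus):
    """Check both edge-degree constraints of the assumed definition."""
    deg = defaultdict(int)
    for u, v in h:
        deg[u] += 1
        deg[v] += 1
    for e in edges:
        u, v = tuple(sorted(e))
        d = deg[u] + deg[v]
        if (u, v) in h and d > beta:
            return False
        if (u, v) not in h and d < beta_minus:
            return False
    return True


# Example: the complete bipartite graph K_{4,4} with beta = 6, beta_minus = 5.
edges = [(i, 4 + j) for i in range(4) for j in range(4)]
h = find_edcs(edges, beta=6, beta_minus=5)
print(len(h), is_edcs(edges, h, 6, 5))
```

Each repair step strictly increases a bounded potential function when $\beta > \beta^-$, which is the usual argument for why this kind of loop terminates; the sketch makes no attempt to be efficient.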

Figures (5)

  • Figure 1: An example due to Bernstein and Stein [ICALP'15] in which a $(\beta, \beta-1)$-EDCS achieves no better than a $2/3$-approximation for any odd $\beta = 2k+1$. Here we have a bipartite graph, each side having three equally sized vertex sets $A, B, C$. The solid blue lines denote the EDCS edges, and the dashed red lines denote the edges not in the EDCS. The vertices in $B$ have degree $k$ in the EDCS, the vertices in $C$ have degree $k+1$ in the EDCS, and the vertices in $A$ have degree one. Note that any edge in the EDCS has edge degree at most $\beta = 2k+1$. The only edges missing from the EDCS are the dashed $B$-$B$ edges, which all have edge degree exactly $\beta - 1 = 2k$. While the graph $G$ has a perfect matching, the EDCS can only match $2/3$ of the vertices.
  • Figure 2: An example of Hall's witness. The curvy red edges denote a maximum matching, and $A$ is a Hall's witness. Generally, considering a vertex set $A \subseteq L$ and any matching, the vertices in $A$ are matched to a subset of $N(A)$. Therefore, even if all the vertices in $L \setminus A$ are somehow matched, the matching has size at most $\left\lvert N(A)\right\rvert + \left\lvert L\setminus A\right\rvert$, i.e. $\mu(G) \leq \left\lvert N(A)\right\rvert + \left\lvert L\setminus A\right\rvert$. Hall's theorem states that there exists a vertex set $A$, referred to as a Hall's witness, for which this inequality is tight.
  • Figure 3: A heatmap of the approximation ratio of $(\beta, \beta^{-})$-EDCS for different values of $\beta^{-}$ (x-axis) and $\beta$ (y-axis). The best approximation ratio, marked with a green circle, is attained at $\beta = 6$ and $\beta^{-} = 5$.
  • Figure 4: The approximation ratio of $(\beta, \beta-1)$-EDCS is computed for all possible values of $\beta \leq 100$. The y-axis denotes the approximation ratio, while the x-axis corresponds to the values of $\beta$. The horizontal red dashed line represents the approximation ratio $2/3$, which was previously believed to be the best possible approximation ratio of EDCS. The largest approximation ratio occurs when $\beta = 6$.
  • Figure 5: The approximation ratio of $(\beta, \beta-c)$-EDCS is calculated for various choices of $c \in \{1, 2, 5, 10\}$ across all feasible values of $\beta \leq 100$. The y-axis represents the approximation ratio, and the x-axis corresponds to the values of $\beta$. The horizontal red dashed line represents the approximation ratio $2/3$. As $c$ increases, the approximation ratio worsens, and the convergence to $2/3$ occurs at a slower rate.
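
The captions of Figures 1 and 2 can be checked together on a tiny instance. The sketch below builds one wiring that is consistent with the Figure 1 caption (the exact edges of the paper's figure are an assumption), computes maximum matchings with a plain augmenting-path routine, and compares them with the Hall-type bound $\min_{A \subseteq L} \left( \lvert N(A) \rvert + \lvert L \setminus A \rvert \right)$ by brute force over all subsets of $L$. All function names are hypothetical.

```python
from itertools import combinations


def figure1_example(k, s):
    """EDCS edges and dashed (non-EDCS) edges; each vertex set has size s."""
    assert s >= k >= 1
    h, dashed = [], []
    for i in range(s):
        h.append((("L", "A", i), ("R", "C", i)))  # A has EDCS degree 1
        h.append((("L", "C", i), ("R", "A", i)))
        for j in range(k):  # B-C edges form a k-regular circulant
            h.append((("L", "B", i), ("R", "C", (i + j) % s)))
            h.append((("L", "C", (i + j) % s), ("R", "B", i)))
        dashed.append((("L", "B", i), ("R", "B", i)))  # B-B edges
    return h, dashed


def adjacency(edges):
    adj = {}
    for u, v in edges:  # every edge is (left vertex, right vertex)
        adj.setdefault(u, set()).add(v)
    return adj


def max_matching(left, adj):
    """Maximum matching size via augmenting paths (Kuhn's algorithm)."""
    match = {}  # right vertex -> left vertex

    def augment(u, seen):
        for v in adj.get(u, ()):
            if v in seen:
                continue
            seen.add(v)
            if v not in match or augment(match[v], seen):
                match[v] = u
                return True
        return False

    return sum(augment(u, set()) for u in left)


def hall_bound(left, adj):
    """Brute-force min over A of |N(A)| + |L minus A|."""
    best = len(left)  # value for the empty set A
    for r in range(1, len(left) + 1):
        for a in combinations(left, r):
            n_a = set().union(*(adj.get(u, ()) for u in a))
            best = min(best, len(n_a) + len(left) - r)
    return best


k, s = 2, 2  # beta = 2k + 1 = 5
h_edges, dashed = figure1_example(k, s)
left = sorted({u for u, _ in h_edges + dashed})
mu_g = max_matching(left, adjacency(h_edges + dashed))
mu_h = max_matching(left, adjacency(h_edges))
print(mu_h, mu_g, hall_bound(left, adjacency(h_edges)))
```

For $k = 2$ and $s = 2$, the EDCS matches 4 of the 6 vertices on each side while $G$ has a perfect matching, and the set $A \cup B$ on the left side is a Hall's witness for the EDCS because its neighborhood is only $C$. The subset enumeration is exponential and is meant only for toy instances.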

Theorems & Definitions (9)

  • Definition 1.1: Bernstein and Stein [ICALP'15]
  • Proposition 1.2: Bernstein and Stein [ICALP'15]
  • Proposition 1.3: BernsteinS15, BernsteinS16, AssadiB19, Behnezhad21
  • Proposition 2.1: Extended Hall's Theorem
  • Theorem 1
  • Claim 3.1
  • proof
  • Claim 3.2
  • proof