Semi-Streaming Algorithms for Weighted $k$-Disjoint Matchings

S M Ferdous, Bhargav Samineni, Alex Pothen, Mahantesh Halappanavar, Bala Krishnamoorthy

TL;DR

The paper studies the maximum weight $k$-disjoint matching ($k$-DM) problem in the semi-streaming setting. Because the problem is NP-hard for $k \geq 2$, the authors target scalable, memory-efficient approximation algorithms. They introduce two single-pass semi-streaming algorithms: a primal-dual method based on an LP relaxation that is $\frac{1}{3+\varepsilon}$-approximate, and a reduction to maximum weight $b$-matching (MW$b$M) that combines an existing semi-streaming MW$b$M algorithm with an edge-coloring step to obtain a $(\frac{1}{2+\varepsilon})(1-\frac{1}{k+1})$-approximation. For any constant $\varepsilon > 0$, both algorithms use $O(nk \log_{1+\varepsilon}^2 n)$ bits of space. The authors prove dual feasibility, derive the approximation guarantees, and analyze time and space complexity. On 95 real-world and synthetic graphs, including graphs generated from data center network traces, the streaming algorithms used $6\times$ to $512\times$ less memory than state-of-the-art offline algorithms, ran faster, and often produced solutions within 5% of the best offline weights. The work thus advances scalable optimization for reconfigurable networks and other large-scale systems where offline methods are impractical.
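To make the two guarantees easier to compare, the following minimal sketch evaluates both worst-case ratios for a few values of $k$. The function names are hypothetical and chosen only for this illustration; the value $\varepsilon = 0.001$ matches the setting reported in the figure captions. The comparison concerns the proven bounds only, not the solution quality observed in the experiments.

```python
from fractions import Fraction

def ratio_primal_dual(eps):
    """Worst-case guarantee of the primal-dual algorithm: 1 / (3 + eps)."""
    return 1 / (3 + eps)

def ratio_b_matching(eps, k):
    """Worst-case guarantee of the b-matching reduction:
    (1 / (2 + eps)) * (1 - 1 / (k + 1))."""
    return (1 / (2 + eps)) * (1 - Fraction(1, k + 1))

# Exact rational arithmetic avoids floating-point noise near the crossover.
eps = Fraction(1, 1000)
for k in (1, 2, 3, 8):
    pd = ratio_primal_dual(eps)
    bm = ratio_b_matching(eps, k)
    stronger = "b-matching reduction" if bm > pd else "primal-dual"
    print(f"k={k}: primal-dual={float(pd):.4f}, "
          f"reduction={float(bm):.4f}, stronger bound: {stronger}")
```

For $k = 2$ the comparison reduces to $2(3+\varepsilon)$ versus $3(2+\varepsilon)$, so the primal-dual bound is stronger for every $\varepsilon > 0$. For $k = 3$ the reduction's bound is stronger whenever $\varepsilon < 1$.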

Abstract

We design and implement two single-pass semi-streaming algorithms for the maximum weight $k$-disjoint matching ($k$-DM) problem. Given an integer $k$, the $k$-DM problem is to find $k$ pairwise edge-disjoint matchings such that the sum of the weights of the matchings is maximized. For $k \geq 2$, this problem is NP-hard. Our first algorithm is based on the primal-dual framework of a linear programming relaxation of the problem and is $\frac{1}{3+\varepsilon}$-approximate. We also develop an approximation preserving reduction from $k$-DM to the maximum weight $b$-matching problem. Leveraging this reduction and an existing semi-streaming $b$-matching algorithm, we design a $(\frac{1}{2+\varepsilon})(1 - \frac{1}{k+1})$-approximate semi-streaming algorithm for $k$-DM. For any constant $\varepsilon > 0$, both of these algorithms require $O(nk \log_{1+\varepsilon}^2 n)$ bits of space. To the best of our knowledge, this is the first study of semi-streaming algorithms for the $k$-DM problem. We compare our two algorithms to state-of-the-art offline algorithms on 95 real-world and synthetic test problems, including thirteen graphs generated from data center network traces. On these instances, our streaming algorithms used significantly less memory (ranging from 6$\times$ to 512$\times$ less) and were faster in runtime than the offline algorithms. Our solutions were often within 5% of the best weights from the offline algorithms. We highlight that the existing offline algorithms run out of 1 TB memory for most of the large instances ($>1$ billion edges), whereas our streaming algorithms can solve these problems using only 100 GB memory for $k=8$.
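The problem definition in the abstract can be made concrete with a small feasibility checker. This is a sketch under stated assumptions, not code from the paper: it assumes a simple undirected graph, represents each edge as a hypothetical `(u, v, weight)` tuple, and accepts fewer than $k$ matchings (equivalent to padding with empty ones). All function names are invented for this illustration.

```python
def is_matching(edges):
    """Return True if no vertex is an endpoint of two edges in `edges`."""
    seen = set()
    for u, v, _w in edges:
        if u == v or u in seen or v in seen:
            return False
        seen.update((u, v))
    return True

def is_k_disjoint_matching(matchings, k):
    """Return True if `matchings` holds at most k matchings that are
    pairwise edge-disjoint (assumes a simple undirected graph)."""
    if len(matchings) > k:
        return False
    if not all(is_matching(m) for m in matchings):
        return False
    used = set()
    for m in matchings:
        for u, v, _w in m:
            key = frozenset((u, v))  # undirected: (u, v) equals (v, u)
            if key in used:
                return False
            used.add(key)
    return True

def total_weight(matchings):
    """Objective value: the sum of edge weights over all matchings."""
    return sum(w for m in matchings for _u, _v, w in m)

# A 4-cycle a-b-c-d-a split into two edge-disjoint perfect matchings.
m1 = [("a", "b", 5), ("c", "d", 4)]
m2 = [("b", "c", 3), ("d", "a", 2)]
print(is_k_disjoint_matching([m1, m2], k=2), total_weight([m1, m2]))
```

Reusing an edge across two matchings, for example `[("a", "b", 5)]` and `[("b", "a", 5)]`, makes the checker return `False`, because edges are compared as unordered pairs.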

Paper Structure

This paper contains 43 sections, 6 theorems, 11 equations, 9 figures, 4 tables, and 5 algorithms.

Key Result

Lemma 1

The solution $\mathcal{M}$ output by the algorithm alg:stk-djm satisfies $w(\mathcal{M}) \geq \frac{1}{2} \sum_{c \in [k]} \sum_{v \in V} \phi(c, v)$.

Figures (9)

  • Figure 1: LP relaxation (primal-k-djm) of $k$-DM and its dual (dual-k-djm).
  • Figure 2: Summary plots for the streaming algorithms on the Small dataset with $\varepsilon = 0.001$. Plot (a) is a boxplot of relative weights across all instances and $k$ values for each algorithm. Plots (b) and (c) give the geometric mean of the relative time and memory, respectively, across all instances with increasing $k$ values. Stk is the baseline algorithm for relative time and memory, while Stk-dp is the baseline for relative weight.
  • Figure 3: Summary plots for the streaming and offline algorithms on the Small dataset, with $\varepsilon = 0.001$ for the streaming algorithms. Plot (a) is a boxplot of relative weights across all instances and $k$ values for each algorithm. Plots (b) and (c) give the geometric mean of the relative time and memory, respectively, across all instances with increasing $k$ values. Stk is the baseline algorithm for relative time and memory, while GPA-It is the baseline for relative weight.
  • Figure 4: Summary plots for the streaming and offline algorithms on the Facebook Trace dataset, with $\varepsilon = 0.001$ for the streaming algorithms. Plot (a) is a boxplot of relative weights across all instances and $k$ values for each algorithm. Plots (b) and (c) give the geometric mean of the relative time and memory, respectively, across all instances with increasing $k$ values. Stk is the baseline algorithm for relative time and memory, while GPA-It is the baseline for relative weight.
  • Figure 5: LP relaxation (primal-mwm) of MWM and its dual (dual-mwm).
  • ...and 4 more figures

Theorems & Definitions (12)

  • Lemma 1
  • Proof
  • Lemma 2
  • Proof
  • Theorem 1
  • Proof
  • Proof
  • Lemma 3
  • Proof
  • Lemma 4
  • ...and 2 more