Semi-Streaming Algorithms for Weighted $k$-Disjoint Matchings

S M Ferdous; Bhargav Samineni; Alex Pothen; Mahantesh Halappanavar; Bala Krishnamoorthy

TL;DR

The paper tackles the maximum weight $k$-disjoint matching ($k$-DM) problem in the semi-streaming setting. The problem is NP-hard for $k \ge 2$, which motivates scalable, memory-efficient approximation algorithms. The paper introduces two single-pass semi-streaming algorithms: a primal-dual method based on an LP relaxation that is $\frac{1}{3+\varepsilon}$-approximate, and a reduction to maximum weight $b$-matching (MW$b$M) that yields a $(\frac{1}{2+\varepsilon})(1-\frac{1}{k+1})$-approximation by combining a semi-streaming MW$b$M algorithm with an edge-coloring step. For any constant $\varepsilon > 0$, both algorithms use $O(nk\log^2 n)$ bits of space. The authors prove dual feasibility, derive the approximation guarantees, and analyze the time and space complexities. In experiments on 95 graphs, including graphs generated from data center network traces, the streaming algorithms use 6$\times$ to 512$\times$ less memory than state-of-the-art offline algorithms, run faster, and often produce solutions within 5% of the best offline weights. The work thus advances scalable optimization for reconfigurable networks and other large-scale systems where offline methods are impractical.
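As a hedged illustration of where the factor $(\frac{1}{2+\varepsilon})(1-\frac{1}{k+1})$ can come from, the chain below sketches one standard argument; the paper's own reduction may differ in its details. The symbols are assumptions introduced here: $B$ is the $b$-matching with $b(v) = k$ returned by the streaming algorithm, assumed to be $\frac{1}{2+\varepsilon}$-approximate; $B^{*}$ is a maximum weight such $b$-matching; $\mathrm{OPT}$ is an optimal $k$-DM solution; and $\mathcal{M}$ is the final output. The sketch assumes a simple graph, so that by Vizing's theorem the edges of $B$ (maximum degree at most $k$) can be properly colored with at most $k+1$ colors, each color class being a matching.

```latex
\begin{align*}
w(\mathcal{M})
  &\geq \Bigl(1 - \tfrac{1}{k+1}\Bigr)\, w(B)
     && \text{(drop the lightest of at most $k+1$ color classes)} \\
  &\geq \Bigl(1 - \tfrac{1}{k+1}\Bigr) \tfrac{1}{2+\varepsilon}\, w(B^{*})
     && \text{(assumed guarantee of the streaming $b$-matching algorithm)} \\
  &\geq \tfrac{1}{2+\varepsilon} \Bigl(1 - \tfrac{1}{k+1}\Bigr)\, w(\mathrm{OPT})
     && \text{(any $k$-DM solution is a feasible $b$-matching with $b(v) = k$)}
\end{align*}
```

The last step uses the fact that the union of $k$ pairwise edge-disjoint matchings has degree at most $k$ at every vertex, so $w(B^{*}) \geq w(\mathrm{OPT})$.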

Abstract

We design and implement two single-pass semi-streaming algorithms for the maximum weight $k$-disjoint matching ($k$-DM) problem. Given an integer $k$, the $k$-DM problem is to find $k$ pairwise edge-disjoint matchings such that the sum of the weights of the matchings is maximized. For $k \geq 2$, this problem is NP-hard. Our first algorithm is based on the primal-dual framework of a linear programming relaxation of the problem and is $\frac{1}{3+\varepsilon}$-approximate. We also develop an approximation preserving reduction from $k$-DM to the maximum weight $b$-matching problem. Leveraging this reduction and an existing semi-streaming $b$-matching algorithm, we design a $(\frac{1}{2+\varepsilon})(1 - \frac{1}{k+1})$-approximate semi-streaming algorithm for $k$-DM. For any constant $\varepsilon > 0$, both of these algorithms require $O(nk \log_{1+\varepsilon}^2 n)$ bits of space. To the best of our knowledge, this is the first study of semi-streaming algorithms for the $k$-DM problem. We compare our two algorithms to state-of-the-art offline algorithms on 95 real-world and synthetic test problems, including thirteen graphs generated from data center network traces. On these instances, our streaming algorithms used significantly less memory (ranging from 6$\times$ to 512$\times$ less) and were faster in runtime than the offline algorithms. Our solutions were often within 5% of the best weights from the offline algorithms. We highlight that the existing offline algorithms run out of 1 TB memory for most of the large instances ($>1$ billion edges), whereas our streaming algorithms can solve these problems using only 100 GB memory for $k=8$.
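To make the problem definition concrete, the following minimal Python sketch checks whether a candidate solution is a valid $k$-DM solution and computes its objective value. The function names `is_k_disjoint_matching` and `total_weight`, and the representation of an edge as a `(u, v, weight)` tuple, are assumptions made for illustration and do not come from the paper.

```python
from typing import Hashable, List, Tuple

# Hypothetical edge format: (endpoint, endpoint, weight).
Edge = Tuple[Hashable, Hashable, float]


def is_k_disjoint_matching(matchings: List[List[Edge]]) -> bool:
    """Return True if every list is a matching and no edge is reused."""
    seen_edges = set()
    for matching in matchings:
        seen_vertices = set()
        for u, v, _w in matching:
            # Within one matching, each vertex may be covered at most once.
            if u == v or u in seen_vertices or v in seen_vertices:
                return False
            seen_vertices.update((u, v))
            # Across matchings, each undirected edge may be used at most once.
            edge = frozenset((u, v))
            if edge in seen_edges:
                return False
            seen_edges.add(edge)
    return True


def total_weight(matchings: List[List[Edge]]) -> float:
    """Objective value: the sum of the weights of all matchings."""
    return sum(w for matching in matchings for _u, _v, w in matching)
```

For example, the two matchings `[("a", "b", 2.0), ("c", "d", 1.0)]` and `[("b", "c", 3.0)]` are edge-disjoint, so together they form a feasible solution for $k = 2$.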

Paper Structure

This paper contains 43 sections, 6 theorems, 11 equations, 9 figures, 4 tables, 5 algorithms.

Introduction
Algorithmic Contributions
Experimental Validation
Preliminaries
Notation
Matchings and $b$-Matchings
$k$-Disjoint Matchings
Semi-Streaming Model
Related Work
Offline Approximation Algorithms
Matchings in the Semi-Streaming Model
Edge Colorings and Unweighted $k$-DM
A Primal-Dual Approach
Analysis of the Algorithm
Dual Feasibility
...and 28 more sections

Key Result

Lemma 1

The solution $\mathcal{M}$ output by Algorithm \ref{alg:stk-djm} satisfies $w(\mathcal{M}) \geq \frac{1}{2} \sum_{c \in [k]} \sum_{v \in V} \phi(c, v)$.
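For intuition about the potentials $\phi$, the following Python sketch implements the well-known single-matching ($k = 1$) primal-dual semi-streaming scheme in the style of Paz and Schwartzman, which keeps one potential $\phi(v)$ per vertex. It is not the paper's algorithm, whose potentials $\phi(c, v)$ are indexed by $c \in [k]$ as well as by vertex; the function name `stream_matching`, the edge format, and the default `eps` are assumptions made for illustration. In this $k = 1$ setting the analogue of Lemma 1, $w(M) \geq \frac{1}{2} \sum_{v} \phi(v)$, holds because every stacked edge raises the potential sum by twice its gain, and the unwinding step recovers at least the total gain.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

# Hypothetical edge format: (endpoint, endpoint, weight); simple graph assumed.
Edge = Tuple[Hashable, Hashable, float]


def stream_matching(
    edges: Iterable[Edge], eps: float = 0.1
) -> Tuple[List[Edge], Dict[Hashable, float]]:
    """One pass over the edge stream, then a greedy unwind of the stack."""
    phi: Dict[Hashable, float] = {}   # vertex potentials (dual variables)
    stack: List[Edge] = []            # edges kept in memory, in arrival order

    for u, v, w in edges:
        pu, pv = phi.get(u, 0.0), phi.get(v, 0.0)
        # Discard edges that do not beat the current potentials by (1 + eps).
        if w <= 0 or w < (1 + eps) * (pu + pv):
            continue
        gain = w - (pu + pv)
        phi[u] = pu + gain
        phi[v] = pv + gain
        stack.append((u, v, w))

    # Unwind: scan the stack newest-first and add edges greedily.
    matched = set()
    matching: List[Edge] = []
    while stack:
        u, v, w = stack.pop()
        if u not in matched and v not in matched:
            matching.append((u, v, w))
            matched.update((u, v))
    return matching, phi
```

On the path a-b-c-d with edge weights 1, 3, 1 arriving in that order, the sketch stacks the first two edges, discards the third, and returns the single edge (b, c) of weight 3, while the potentials sum to 6, so the bound holds with equality.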

Figures (9)

Figure 1: LP relaxation \ref{primal-k-djm} of $k$-DM and its dual \ref{dual-k-djm}.
Figure 2: Summary plots for the streaming algorithms on the Small instances with $\varepsilon = 0.001$. Plot (a) is a boxplot of relative weights across all instances and $k$ values for each algorithm. Plots (b) and (c) give the geometric mean of the relative time and memory, respectively, across all instances with increasing $k$ values. Stk is the baseline algorithm for relative time and memory, while Stk-dp is the baseline for relative weight.
Figure 3: Summary plots for the streaming and offline algorithms on the Small dataset, with $\varepsilon = 0.001$ for the streaming algorithms. Plot (a) is a boxplot of relative weights across all instances and $k$ values for each algorithm. Plots (b) and (c) give the geometric mean of the relative time and memory, respectively, across all instances with increasing $k$ values. Stk is the baseline algorithm for relative time and memory, while GPA-It is the baseline for relative weight.
Figure 4: Summary plots for the streaming and offline algorithms on the Facebook Trace dataset, with $\varepsilon = 0.001$ for the streaming algorithms. Plot (a) is a boxplot of relative weights across all instances and $k$ values for each algorithm. Plots (b) and (c) give the geometric mean of the relative time and memory, respectively, across all instances with increasing $k$ values. Stk is the baseline algorithm for relative time and memory, while GPA-It is the baseline for relative weight.
Figure 5: LP relaxation \ref{primal-mwm} of MWM and its dual \ref{dual-mwm}.
...and 4 more figures

Theorems & Definitions (12)

Lemma 1
Proof
Lemma 2
Proof
Theorem 1
Proof
Proof
Lemma 3
Proof
Lemma 4
...and 2 more
