Mini-batch Submodular Maximization

Gregory Schwartzman

TL;DR

This work tackles maximizing a non-negative monotone decomposable submodular function $F = \sum_{i=1}^N f^i$ under constraints, focusing on reducing oracle calls to the constituent functions $f^i$. It introduces the first mini-batch greedy algorithm, which samples a fresh batch at each iteration, and analyzes both uniform and weighted sampling, showing that uniform mini-batch often outperforms weighted sampling in practice. The paper develops two smoothed-analysis models (Model 1 and Model 2) to justify the empirical superiority of uniform sampling, proving high-probability approximation guarantees under curvature and $p$-system constraints with near-linear preprocessing and sub-quadratic execution costs. Empirically, the approach matches or surpasses sparsifier-based methods across diverse real-world datasets, with complexity that is effectively independent of $N$ in the uniform setting, making it well-suited for massive datasets and scalable submodular optimization.

Abstract

We present the first mini-batch algorithm for maximizing a non-negative monotone decomposable submodular function, $F=\sum_{i=1}^N f^i$, under a set of constraints. We consider two sampling approaches: uniform and weighted. We first show that mini-batch with weighted sampling improves over the state-of-the-art sparsifier-based approach both in theory and in practice. Surprisingly, our experimental results show that uniform sampling is superior to weighted sampling. However, it is impossible to explain this using worst-case analysis. Our main contribution is using smoothed analysis to provide a theoretical foundation for our experimental results. We show that, under very mild assumptions, uniform sampling is superior for both the mini-batch and the sparsifier approaches. We empirically verify that these assumptions hold for our datasets. Uniform sampling is simple to implement and has complexity independent of $N$, making it the perfect candidate to tackle massive real-world datasets.
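
The uniform mini-batch idea summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's algorithm: the constituent functions are taken to be facility-location terms $f^i(S) = \max_{e \in S} w_{i,e}$, the constraint is a plain cardinality constraint $k$, and the names `minibatch_greedy`, `full_value`, `weights`, and `batch_size` are hypothetical.

```python
import random


def full_value(weights, solution):
    """Exact F(S) = sum_i max_{e in S} weights[i][e] (touches all N functions)."""
    return sum(max((row[e] for e in solution), default=0.0) for row in weights)


def minibatch_greedy(weights, k, batch_size, seed=0):
    """Greedy selection of k elements using a fresh uniform batch per iteration.

    Assumption: weights[i][e] >= 0 and f^i(S) = max_{e in S} weights[i][e],
    an illustrative monotone submodular choice (facility location).
    """
    rng = random.Random(seed)
    n_funcs, n_elems = len(weights), len(weights[0])
    chosen = []
    for _ in range(k):
        candidates = [e for e in range(n_elems) if e not in chosen]
        if not candidates:
            break
        # Fresh uniform batch of function indices, sampled without replacement.
        batch = rng.sample(range(n_funcs), min(batch_size, n_funcs))
        scale = n_funcs / len(batch)  # rescale so the batch sum estimates F
        # Current value f^i(S), computed only for the sampled functions.
        current = {i: max((weights[i][e] for e in chosen), default=0.0) for i in batch}

        def estimated_gain(e):
            return scale * sum(max(weights[i][e] - current[i], 0.0) for i in batch)

        chosen.append(max(candidates, key=estimated_gain))
    return chosen


if __name__ == "__main__":
    w = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0.5]]
    picked = minibatch_greedy(w, k=2, batch_size=2, seed=1)
    print(picked, full_value(w, picked))
```

In this sketch each iteration evaluates only `batch_size` constituent functions, so the per-iteration cost does not grow with $N$; setting `batch_size` to $N$ recovers the standard greedy algorithm on the full objective.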

Paper Structure

This paper contains 44 sections, 12 theorems, 22 equations, 2 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

The meta mini-batch greedy algorithm with a $(1-\epsilon)$-approximate incremental oracle has the following guarantees w.h.p.: (1) It achieves a $(1-1/e-\epsilon)$-approximation under a cardinality constraint $k$ [goundan2007revisiting]. (2) It achieves a $\frac{1-\epsilon}{1+p}$-approximation un
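
As a rough numerical illustration of the two ratios stated above, the following evaluates them at $\epsilon = 0.1$ and $\epsilon = 0.2$ (the values used in Figure 2). The choice $p = 1$ is an assumption made here purely for the arithmetic.

```latex
\begin{align*}
\epsilon = 0.1:\quad & 1 - \tfrac{1}{e} - \epsilon \approx 0.632 - 0.1 = 0.532,
  & \frac{1-\epsilon}{1+p} &= \frac{0.9}{2} = 0.45 \quad (p = 1),\\
\epsilon = 0.2:\quad & 1 - \tfrac{1}{e} - \epsilon \approx 0.632 - 0.2 = 0.432,
  & \frac{1-\epsilon}{1+p} &= \frac{0.8}{2} = 0.40 \quad (p = 1).
\end{align*}
```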

Figures (2)

  • Figure 1: Sparsifier and mini-batch compared with lazy-greedy.
  • Figure 2: Sparsifier and mini-batch compared with stochastic-greedy for ${\epsilon}=0.1$ and ${\epsilon}=0.2$.

Theorems & Definitions (18)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Lemma 5
  • Theorem 6: Chernoff bound [MotwaniR95]
  • Lemma 7
  • proof
  • proof: Proof of the mini-batch approximation guarantees theorem
  • Theorem 8: Bounded dependency Chernoff bound [Pemmaraju01]
  • ...and 8 more