Deletion Robust Non-Monotone Submodular Maximization over Matroids
Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam
TL;DR
We study deletion-robust submodular maximization under a matroid constraint: an algorithm first produces a compact summary $W$, an oblivious adversary then deletes a set $D$ of up to $d$ elements, and the final solution is chosen from $W\setminus D$. The approach combines a threshold-based bucketing scheme, random sampling within large buckets, and matroid-aware injective mappings to preserve value after deletions; a second phase (Phase II) then runs a standard submodular maximization routine under the matroid constraint on the surviving elements. The resulting guarantees are constant-factor, with a summary size that depends only on the matroid rank $k$, the deletion budget $d$, and $\varepsilon$. In the centralized setting, the algorithm achieves a $(4.597+O(\varepsilon))$-approximation with summary size $O\left(\frac{k+d}{\varepsilon^2}\log \frac{k}{\varepsilon}\right)$, improved to $(3.582+O(\varepsilon))$ with summary size $O\left(k + \frac{d}{\varepsilon^2}\log \frac{k}{\varepsilon}\right)$ for monotone objectives. In the streaming setting, it achieves a $(9.435+O(\varepsilon))$-approximation with summary size and memory $O\left(k + \frac{d}{\varepsilon^2}\log \frac{k}{\varepsilon}\right)$, improved to $(5.582+O(\varepsilon))$ for monotone objectives. These results give space-efficient constant-factor algorithms for deletion-robust submodular maximization over general matroids, relevant to data summarization and recommendation tasks in which elements may later be removed, for example because of privacy requests or changed user preferences. The techniques may extend to other constraints and to fully dynamic settings.
Abstract
Maximizing a submodular function is a fundamental task in machine learning, and in this paper we study the deletion-robust version of the problem under the classic matroid constraint. Here the goal is to extract a small summary of the dataset that contains a high-value independent set even after an adversary has deleted some elements. We present constant-factor approximation algorithms whose space complexity depends on the rank $k$ of the matroid and the number $d$ of deleted elements. In the centralized setting we present a $(4.597+O(\varepsilon))$-approximation algorithm with summary size $O(\frac{k+d}{\varepsilon^2}\log \frac{k}{\varepsilon})$, which improves to a $(3.582+O(\varepsilon))$-approximation with summary size $O(k + \frac{d}{\varepsilon^2}\log \frac{k}{\varepsilon})$ when the objective is monotone. In the streaming setting we provide a $(9.435 + O(\varepsilon))$-approximation algorithm with summary size and memory $O(k + \frac{d}{\varepsilon^2}\log \frac{k}{\varepsilon})$; the approximation factor improves to $(5.582+O(\varepsilon))$ in the monotone case.
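To make the problem setup concrete, the following is a minimal Python sketch of the deletion-robust model only; it is not the paper's algorithm. Everything in it is an illustrative assumption: the coverage objective, the partition matroid, the helper names (`coverage`, `is_independent`, `naive_summary`, `greedy`), and in particular the naive summary rule, which simply keeps the highest-valued singletons in each part and carries none of the guarantees stated above. Plain greedy stands in for the routine that selects a solution from the surviving elements.

```python
from typing import Callable, Dict, Iterable, List, Set


def coverage(sets: Dict[str, Set[int]]) -> Callable[[Iterable[str]], int]:
    """Return a coverage function f(S) = |union of sets[e] for e in S| (monotone submodular)."""
    def f(chosen: Iterable[str]) -> int:
        covered: Set[int] = set()
        for e in chosen:
            covered |= sets[e]
        return len(covered)
    return f


def is_independent(chosen: Iterable[str], part_of: Dict[str, str],
                   capacity: Dict[str, int]) -> bool:
    """Partition matroid: at most capacity[p] chosen elements from each part p."""
    counts: Dict[str, int] = {}
    for e in chosen:
        p = part_of[e]
        counts[p] = counts.get(p, 0) + 1
        if counts[p] > capacity[p]:
            return False
    return True


def naive_summary(ground: Iterable[str], f, part_of, capacity, d: int) -> Set[str]:
    """Hypothetical placeholder summary (NOT the paper's method): in each part,
    keep the capacity + d elements with the largest singleton value."""
    by_part: Dict[str, List[str]] = {}
    for e in ground:
        by_part.setdefault(part_of[e], []).append(e)
    summary: Set[str] = set()
    for p, elems in by_part.items():
        elems.sort(key=lambda e: (-f([e]), e))  # ties broken by name
        summary.update(elems[: capacity[p] + d])
    return summary


def greedy(candidates: Iterable[str], f, part_of, capacity) -> List[str]:
    """Plain greedy over the surviving elements, respecting the matroid."""
    solution: List[str] = []
    remaining = sorted(candidates)
    while True:
        best, best_gain = None, 0
        for e in remaining:
            if e in solution or not is_independent(solution + [e], part_of, capacity):
                continue
            gain = f(solution + [e]) - f(solution)
            if gain > best_gain:  # strict: earlier element wins ties
                best, best_gain = e, gain
        if best is None:
            return solution
        solution.append(best)


# Toy instance: six elements split into two parts, one element allowed per part.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5},
        "x": {1, 2, 3, 4}, "y": {6, 7}, "z": {7}}
part_of = {"a": "left", "b": "left", "c": "left",
           "x": "right", "y": "right", "z": "right"}
capacity = {"left": 1, "right": 1}
d = 1

f = coverage(sets)
W = naive_summary(sets.keys(), f, part_of, capacity, d)  # summary built before deletions
D = {"x"}                                                # adversary deletes up to d elements
solution = greedy(W - D, f, part_of, capacity)           # solution chosen from W \ D
print(sorted(W), solution, f(solution))
```

On this toy instance the summary keeps two elements per part, so deleting the most valuable element `x` still leaves a feasible replacement in the same part. The algorithms described above aim for this kind of robustness with provable approximation factors and a summary size that scales with $k$ and $d$ rather than with the dataset.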
