Computing Approximate Pareto Frontiers for Submodular Utility and Cost Tradeoffs
Karan Vombatkere, Evimaria Terzi
TL;DR
This work addresses the problem of selecting subsets that balance a monotone submodular utility $f$ against a cost $c$ to be minimized. It introduces the Pareto-$\langle f,c\rangle$ framework and $(\alpha_1,\alpha_2)$-approximate Pareto frontiers, and develops specialized algorithms for different cost models: C-Greedy and F-Greedy for the $\langle f,\texttt{Cardinality}\rangle$ and $\langle f,\texttt{Linear}\rangle$ cases, respectively; FC-Greedy and Pareto-Greedy for general settings; and C-Greedy-Diameter, a metric-ball-based method, for the $\langle f,\texttt{Diameter}\rangle$ case. The paper proves theoretical guarantees for these frontiers and demonstrates the algorithms' practical effectiveness on team formation, recommender systems, and influence maximization datasets, where they produce polynomial-size, representative sets of tradeoffs. By exposing multiple utility-cost operating points, the framework offers principled guidance for choosing budgets and accommodating stakeholder preferences in real-world submodular optimization tasks.
Abstract
In many data-mining applications, including recommender systems, influence maximization, and team formation, the goal is to pick a subset of elements (e.g., items, nodes in a network, experts to perform a task) to maximize a monotone submodular utility function while simultaneously minimizing a cost function. Classical formulations model this tradeoff via cardinality or knapsack constraints, or by combining utility and cost into a single weighted objective. However, such approaches require committing to a specific tradeoff in advance and return only a single solution, offering limited insight into the space of viable utility-cost tradeoffs. In this paper, we depart from the single-solution paradigm and examine the problem of computing representative sets of high-quality solutions that expose different tradeoffs between submodular utility and cost. For this, we introduce $(\alpha_1,\alpha_2)$-approximate Pareto frontiers that provably approximate the achievable tradeoffs between submodular utility and cost. Specifically, we formalize the Pareto-$\langle f,c \rangle$ problem and develop efficient algorithms for multiple instantiations arising from different combinations of submodular utility $f$ and cost functions $c$. Our results offer a principled and practical framework for understanding and exploiting utility-cost tradeoffs in submodular optimization. Experiments on datasets from diverse application domains demonstrate that our algorithms efficiently compute approximate Pareto frontiers in practice.
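To make the idea of a utility-cost frontier concrete, the following is a minimal sketch for the simplest setting, where cost is the cardinality $|S|$ and utility is set coverage (a standard monotone submodular function). It is an illustration under stated assumptions, not the paper's C-Greedy: the names `greedy_frontier` and `example_sets` and the toy data are hypothetical. It relies only on the classical fact that each prefix of the greedy ordering of size $k$ achieves at least a $(1-1/e)$ fraction of the best utility attainable with $k$ elements, so the list of prefixes gives one approximate tradeoff point per cost level.

```python
from typing import Dict, List, Set, Tuple


def greedy_frontier(sets: Dict[str, Set[int]]) -> List[Tuple[int, int]]:
    """Return (cost, utility) points, one per greedy prefix.

    Assumptions (illustrative only): cost is the number of chosen
    elements, and utility is the number of distinct items covered.
    """
    covered: Set[int] = set()
    remaining = set(sets)
    frontier: List[Tuple[int, int]] = [(0, 0)]  # empty set: zero cost, zero utility
    cost = 0
    while remaining:
        # Pick the element with the largest marginal coverage gain;
        # iterating in sorted order breaks ties by name deterministically.
        best = max(sorted(remaining), key=lambda name: len(sets[name] - covered))
        if not sets[best] - covered:
            break  # no remaining element adds utility, so stop early
        covered |= sets[best]
        remaining.remove(best)
        cost += 1
        frontier.append((cost, len(covered)))
    return frontier


# Hypothetical toy instance: four candidate elements covering items 1..7.
example_sets = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1},
}
frontier = greedy_frontier(example_sets)
print(frontier)  # [(0, 0), (1, 4), (2, 7)]
```

Each returned pair is one operating point: here two elements already cover all seven items, so larger solutions would only add cost. A decision maker can read the budget off this list instead of fixing a cardinality constraint in advance, which is the single-solution limitation the abstract describes.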
