Computing Approximate Pareto Frontiers for Submodular Utility and Cost Tradeoffs

Karan Vombatkere, Evimaria Terzi

TL;DR

This work addresses the problem of selecting subsets that balance a monotone submodular utility $f$ against a cost $c$ to be minimized, introducing the Pareto-$\langle f,c\rangle$ framework and $(\alpha_1,\alpha_2)$-approximate Pareto frontiers. It develops specialized algorithms for different cost models: C-Greedy and F-Greedy for the $\langle f,\texttt{Cardinality} \rangle$ and $\langle f,\texttt{Linear} \rangle$ cases, respectively; FC-Greedy and Pareto-Greedy for general settings; and C-Greedy-Diameter, a metric-ball-based method, for the $\langle f,\texttt{Diameter} \rangle$ case. The paper proves theoretical guarantees for these frontiers and demonstrates their practical effectiveness on team formation, recommender system, and influence maximization datasets, where they produce polynomial-size, representative sets of tradeoffs. By enabling exploration of multiple utility–cost operating points, the framework offers principled guidance for choosing budgets and accommodating stakeholder preferences in real-world submodular optimization tasks.

Abstract

In many data-mining applications, including recommender systems, influence maximization, and team formation, the goal is to pick a subset of elements (e.g., items, nodes in a network, experts to perform a task) to maximize a monotone submodular utility function while simultaneously minimizing a cost function. Classical formulations model this tradeoff via cardinality or knapsack constraints, or by combining utility and cost into a single weighted objective. However, such approaches require committing to a specific tradeoff in advance and return only a single solution, offering limited insight into the space of viable utility-cost tradeoffs. In this paper, we depart from the single-solution paradigm and examine the problem of computing representative sets of high-quality solutions that expose different tradeoffs between submodular utility and cost. For this, we introduce $(\alpha_1,\alpha_2)$-approximate Pareto frontiers that provably approximate the achievable tradeoffs between submodular utility and cost. Specifically, we formalize the Pareto-$\langle f,c \rangle$ problem and develop efficient algorithms for multiple instantiations arising from different combinations of submodular utility $f$ and cost functions $c$. Our results offer a principled and practical framework for understanding and exploiting utility-cost tradeoffs in submodular optimization. Experiments on datasets from diverse application domains demonstrate that our algorithms efficiently compute approximate Pareto frontiers in practice.

Paper Structure

This paper contains 25 sections, 10 theorems, 21 equations, 4 figures, 6 tables, and 6 algorithms.

Key Result

Lemma 1

Given cardinality thresholds $\mathcal{B} = \{1,\ldots,n\}$ and $\tau \ge 0$, the C-Greedy algorithm returns a $\left(1-\frac{1}{e},1\right)$-approximate Pareto frontier $\mathcal{P}$ of size $\mathcal{O}(n)$ for the Pareto-$\langle f,\texttt{Cardinality} \rangle$ problem.
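
The following is a minimal sketch of the idea behind this result, not the paper's C-Greedy pseudocode. It assumes the frontier is built from the prefixes of a single run of the standard greedy algorithm: by the classical greedy guarantee for monotone submodular maximization, the prefix of size $k$ has utility at least $(1-\frac{1}{e})$ times that of the best set of size $k$, and its cardinality cost is exactly $k$. The function names `greedy_prefix_frontier` and `coverage`, the toy data, and the alphabetical tie-breaking are hypothetical choices made for illustration; the parameter $\tau$ of the lemma is not modeled here.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

Point = Tuple[int, float, FrozenSet[str]]  # (cardinality, utility, solution)


def greedy_prefix_frontier(
    items: List[str], f: Callable[[FrozenSet[str]], float]
) -> List[Point]:
    """Run standard greedy once and record every prefix as a frontier point.

    Assumption of this sketch: f is monotone submodular with f(empty) = 0.
    """
    chosen: Set[str] = set()
    value = f(frozenset(chosen))
    remaining = sorted(items)  # fixed order so that ties break deterministically
    frontier: List[Point] = []
    while remaining:
        best_item, best_gain = remaining[0], float("-inf")
        for v in remaining:
            gain = f(frozenset(chosen | {v})) - value  # marginal gain of v
            if gain > best_gain:
                best_item, best_gain = v, gain
        chosen.add(best_item)
        remaining.remove(best_item)
        value += best_gain
        frontier.append((len(chosen), value, frozenset(chosen)))
    return frontier


# Toy coverage utility (hypothetical data): f(S) = number of covered elements.
COVERS: Dict[str, Set[int]] = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1},
}


def coverage(solution: FrozenSet[str]) -> float:
    return len(set().union(*(COVERS[v] for v in solution)))


frontier = greedy_prefix_frontier(list(COVERS), coverage)
for size, utility, solution in frontier:
    print(size, utility, sorted(solution))
```

On this toy instance the sketch records one point per cardinality $k \in \{1,\ldots,4\}$, so the frontier has $n$ points, matching the $\mathcal{O}(n)$ size stated in the lemma.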

Figures (4)

  • Figure 1: Approximate Pareto frontiers for all algorithms for the Pareto-$\langle f,\texttt{Linear} \rangle$ problem. FC-Greedy is evaluated using logarithmic $\varepsilon$-grids $\mathcal{K}_{0.1}$ and $\mathcal{B}_{0.1}$, while C-Greedy and Top-K are baseline methods executed using a linear cost grid $\mathcal{B}_{\Delta}$.
  • Figure 2: Approximate Pareto frontiers for all algorithms for the Pareto-$\langle f,\texttt{Diameter} \rangle$ problem.
  • Figure 3: Approximate Pareto frontiers for all algorithms for the Pareto-$\langle f,\texttt{Cardinality} \rangle$ problem.
  • Figure 4: Representative single-run Pareto frontiers for Pareto-$\langle f,\texttt{Linear} \rangle$ across datasets. Each subfigure shows six randomly selected single samples to highlight qualitative differences, while the main text reports mean Pareto frontiers aggregated across samples.

Theorems & Definitions (12)

  • Definition 1: Pareto Point
  • Definition 2: $(\alpha_1,\alpha_2)$-approximate Pareto frontier
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • ...and 2 more