Pareto Optimization with Robust Evaluation for Noisy Subset Selection

Yi-Heng Xu; Dan-Xuan Liu; Chao Qian

TL;DR

The paper tackles noisy subset selection under a cardinality constraint by modeling the objective $f$ as a monotone submodular function and addressing noise in evaluations $F(S)$, where $\mathbb{E}[F(S)]=f(S)$. It introduces PORE, a Pareto-optimization framework with a robust first objective $f_1(S)$ computed as the average of $F$ over all $(|S|-1)$-sized subsets of $S$, and a size-based second objective $f_2(S)=-|S|$, guided by $\theta$-domination and a population cap to balance robustness and efficiency. Empirical results on influence maximization and sparse regression show that PORE consistently outperforms the greedy algorithm, POSS, and PONSS, with notable gains on protein data; ablation confirms the value of the robust evaluation. The work highlights the practical potential of robust Pareto-based strategies for noisy combinatorial selection problems and opens avenues for theoretical analysis of PORE’s approximation properties.
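The two objectives described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the noisy oracle `make_noisy_coverage` (a coverage function plus zero-mean Gaussian noise, so that $\mathbb{E}[F(S)]=f(S)$), the helper names, and the handling of the empty set are hypothetical and are not taken from the paper.

```python
import random
from itertools import combinations
from statistics import mean


def make_noisy_coverage(sets, sigma=0.1, seed=0):
    """Hypothetical noisy oracle F (an assumption, not the paper's setup).

    The true objective f(S) is the number of elements covered by the chosen
    sets, which is monotone submodular. F(S) adds zero-mean Gaussian noise,
    so E[F(S)] = f(S).
    """
    rng = random.Random(seed)

    def F(S):
        covered = set().union(*(sets[i] for i in S))
        return len(covered) + rng.gauss(0.0, sigma)

    return F


def robust_f1(F, S):
    """Average of F over all (|S|-1)-sized subsets of S, as in the TL;DR."""
    S = tuple(S)
    if not S:
        # Assumption: the summary does not say how the empty set is handled,
        # so this sketch simply evaluates F on it directly.
        return F(())
    return mean(F(T) for T in combinations(S, len(S) - 1))


def f2(S):
    """Size-based second objective: smaller subsets score higher."""
    return -len(S)


if __name__ == "__main__":
    sets = {0: {1, 2, 3}, 1: {3}, 2: {4}}
    F = make_noisy_coverage(sets, sigma=0.5, seed=1)
    S = (0, 1, 2)
    print("single noisy evaluation:", F(S))
    print("robust f1:", robust_f1(F, S), "f2:", f2(S))
```

Note that one call to `robust_f1` costs $|S|$ evaluations of $F$, since a set of size $|S|$ has exactly $|S|$ subsets of size $|S|-1$.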

Abstract

Subset selection is a fundamental problem in combinatorial optimization with a wide range of applications, such as influence maximization and sparse regression. The goal is to select a subset of limited size from a ground set so as to maximize a given objective function. However, in real-world scenarios the evaluation of the objective function is often noisy. Previous algorithms, including the greedy algorithm and the multi-objective evolutionary algorithms POSS and PONSS, either struggle in noisy environments or consume excessive computational resources. In this paper, we focus on the noisy subset selection problem with a cardinality constraint, where the evaluation of a subset is noisy. We propose a novel approach based on Pareto Optimization with Robust Evaluation for noisy subset selection (PORE), which simultaneously maximizes a robust evaluation function and minimizes the subset size. PORE efficiently identifies well-structured solutions while keeping its computational cost under control, addressing the limitations observed in PONSS. Our experiments on real-world datasets for influence maximization and sparse regression demonstrate that PORE significantly outperforms previous methods, including the classical greedy algorithm, POSS, and PONSS. Ablation studies further confirm the effectiveness of the robust evaluation function.

Paper Structure (14 sections, 15 equations, 5 figures, 4 algorithms)

Introduction
Noisy Subset Selection
Previous Algorithms
The Greedy Algorithm
The POSS Algorithm
The PONSS Algorithm
The Proposed PORE algorithm
Empirical Study
Influence Maximization
Sparse Regression
Ablation Study of Robust Evaluation
Comparison under Different Noise Intensities
Influence of Different Settings of $\theta$
Conclusion

Figures (5)

Figure 1: Influence maximization (influence spread: the larger the better). The left subfigure on each dataset: influence spread vs budget $k$. The right subfigure on each dataset: influence spread vs running time of PORE, PONSS and POSS for $k=7$.
Figure 2: Sparse regression ($R^2$: the larger the better). The left subfigure on each dataset: $R^2$ vs budget $k$. The right subfigure on each dataset: $R^2$ vs running time of PORE, PONSS and POSS for $k = 14$.
Figure 3: The ablation experiments on the robust evaluation, where PORE-F denotes PORE without robust evaluation.
Figure 4: Performance of the algorithms under different noise intensities. The left subfigure on ego-Facebook displays the influence spread versus simulation times for $k=7$. The right subfigure on protein displays $R^2$ versus sample size for $k=16$.
Figure 5: Performance of PORE with different $\theta$ values on the application of influence maximization, where the dataset is ego-Facebook and the budget $k=7$.

Theorems & Definitions (6)

Definition 1: Noisy Subset Selection with a Cardinality Constraint
Definition 2: Submodularity Ratio [das2011submodular]
Definition 3: Influence Maximization [kempe2003maximizing]
Definition 4: Sparse Regression [miller2002subset]
Definition 5: Domination
Definition 6: $\theta$-Domination
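
This listing gives Definitions 5 and 6 by name only, so the following is a hedged sketch rather than the paper's wording. `weakly_dominates` and `dominates` follow the standard bi-objective notion of domination for two maximized objectives $(f_1, f_2)$ with $f_2(S)=-|S|$. The multiplicative factor $(1+\theta)/(1-\theta)$ in `theta_dominates` is an assumption modeled on the rule commonly associated with PONSS; the paper's Definition 6 may differ. All function names are hypothetical.

```python
def weakly_dominates(a, b):
    """a and b are (f1, f2) pairs; both objectives are maximized."""
    return a[0] >= b[0] and a[1] >= b[1]


def dominates(a, b):
    """Standard domination: no worse on both objectives, better on one."""
    return weakly_dominates(a, b) and (a[0] > b[0] or a[1] > b[1])


def theta_dominates(a, b, theta):
    """Assumed noise-aware rule (not confirmed by this summary).

    a must beat b on f1 by the factor (1 + theta) / (1 - theta) and be no
    larger in size, so small noisy differences in f1 do not count as wins.
    """
    if not 0 <= theta < 1:
        raise ValueError("theta must lie in [0, 1)")
    factor = (1 + theta) / (1 - theta)
    return a[0] >= factor * b[0] and a[1] >= b[1]


if __name__ == "__main__":
    x = (3.0, -2)  # f1 = 3.0 with |S| = 2
    y = (2.5, -3)  # f1 = 2.5 with |S| = 3
    print(dominates(x, y))
    print(theta_dominates(x, y, theta=0.2))
```

With `theta = 0` the assumed rule reduces to weak domination; larger values of `theta` require a wider margin in $f_1$ before one solution is allowed to displace another.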
