Localized Distributional Robustness in Submodular Multi-Task Subset Selection

Ege C. Kaya, Abolfazl Hashemi

TL;DR

This work adds a relative-entropy regularization term to the standard multi-task objective and shows, through duality, that the resulting formulation is equivalent to maximizing a monotone increasing function composed with a submodular function, which standard greedy selection methods can carry out efficiently.

Abstract

In this work, we approach the problem of multi-task submodular optimization from the perspective of local distributional robustness, within the neighborhood of a reference distribution that assigns an importance score to each task. We first add a relative-entropy regularization term to the standard multi-task objective. We then demonstrate through duality that this novel formulation is equivalent to the maximization of a monotone increasing function composed with a submodular function, which can be carried out efficiently through standard greedy selection methods. This approach bridges the existing gap in the optimization of performance-robustness trade-offs in multi-task subset selection. To numerically validate our theoretical results, we test the proposed method in two settings: the selection of satellites in low Earth orbit constellations, a sensor selection problem involving weak-submodular functions, and an image summarization task using neural networks, involving submodular functions. We compare our method with two other algorithms: one that optimizes the performance of the worst-case task, and one that directly optimizes the performance on the reference distribution itself. We conclude that our novel formulation produces a solution that is locally distributionally robust and computationally inexpensive.
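
As a hedged illustration of the duality step described above, the following sketch writes out one common form of a relative-entropy-regularized worst case. The notation is an assumption made for this example and may differ from the paper's: $f_i$ are the task utilities, $q_i$ the weights of the reference distribution $Q$, $\lambda > 0$ the regularization strength, and $\Delta$ the probability simplex over $T$ tasks.

```latex
% Gibbs variational principle: the inner minimization over P has a closed form.
\min_{P \in \Delta} \Big\{ \sum_{i=1}^{T} p_i f_i(S) + \lambda\, D_{\mathrm{KL}}(P \,\|\, Q) \Big\}
  = -\lambda \log \sum_{i=1}^{T} q_i \exp\!\big(-f_i(S)/\lambda\big),
\qquad
p_i^\ast \propto q_i \exp\!\big(-f_i(S)/\lambda\big).

% Rewriting the right-hand side as a monotone increasing function of h(S):
-\lambda \log \sum_{i=1}^{T} q_i \exp\!\big(-f_i(S)/\lambda\big)
  = -\lambda \log\big(1 - h(S)\big),
\qquad
h(S) = \sum_{i=1}^{T} q_i \Big(1 - \exp\!\big(-f_i(S)/\lambda\big)\Big).
```

Under these assumptions, $x \mapsto 1 - e^{-x/\lambda}$ is concave and nondecreasing, so $h$ is monotone and submodular whenever each $f_i$ is, and $-\lambda\log(1 - h)$ is increasing in $h$. Maximizing the regularized objective over $S$ would then reduce to maximizing $h(S)$, which is the kind of problem greedy selection handles.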

Paper Structure (13 sections, 4 theorems, 59 equations, 7 figures, 2 algorithms)

Key Result

Theorem 1

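[mirzasoleiman2015lazier] Let $f \colon 2^N \rightarrow \mathbb{R}$ be a normalized, monotone nondecreasing submodular function. Let $R = (\lvert N\rvert/K)\log(1/\epsilon)$ be the size of the set sampled at each iteration of Stochastic Greedy when solving the cardinality-constrained problem $\max_{\lvert S\rvert \leq K} f(S)$, where $\epsilon > 0$. Then the set $S$ returned by Stochastic Greedy satisfies $\mathbb{E}[f(S)] \geq (1 - 1/e - \epsilon)\, f(S^\ast)$, where $S^\ast$ is a maximizer of the cardinality-constrained problem.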
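
The sampling scheme in Theorem 1 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `stochastic_greedy`, the toy `COVERS` instance, and the `coverage` utility are hypothetical, and the sample size is rounded up and capped at $\lvert N\rvert$ as a practical assumption.

```python
import math
import random


def stochastic_greedy(f, ground_set, K, eps=0.1, seed=0):
    """Select up to K elements; each step scans only a random sample of size R."""
    rng = random.Random(seed)
    remaining = list(ground_set)
    n = len(remaining)
    # Sample size R = (|N| / K) * log(1 / eps), rounded up and capped at |N|.
    R = min(n, max(1, math.ceil((n / K) * math.log(1.0 / eps))))
    selected = []
    current_value = f(selected)
    for _ in range(K):
        if not remaining:
            break
        # Draw the candidates for this step without replacement.
        sample = rng.sample(remaining, min(R, len(remaining)))
        best, best_gain = None, -math.inf
        for e in sample:
            gain = f(selected + [e]) - current_value  # marginal gain of e
            if gain > best_gain:
                best, best_gain = e, gain
        selected.append(best)
        remaining.remove(best)
        current_value += best_gain
    return selected


# Hypothetical toy instance: element e covers the items in COVERS[e].
COVERS = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 2, 3, 4}}


def coverage(S):
    """Number of items covered by S (normalized, monotone, submodular)."""
    covered = set()
    for e in S:
        covered |= COVERS[e]
    return len(covered)


picked = stochastic_greedy(coverage, COVERS.keys(), K=2, eps=1e-9)
print(picked, coverage(picked))
```

With a very small `eps`, the capped sample covers all remaining elements, so the sketch reduces to plain greedy selection. Larger values of `eps` shrink the sample and trade some approximation quality for fewer function evaluations.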

Figures (7)

  • Figure 1: Illustration of the three discrete distributions on the $3$-dimensional simplex [simpleximage] that the three discussed approaches optimize. $P_{\text{worst}}$ corresponds to the global worst-case task scenario, assigning a weight of $1$ to the worst-case task and a weight of $0$ to all the others, residing on a vertex of the simplex. $P_{\text{avg}}$ assigns uniform weight to all tasks, and lies in the center of the simplex. $Q$ is the reference distribution within a neighborhood of which we want to achieve local robustness. $P^\ast$ is the local worst-case distribution within that neighborhood of $Q$.
  • Figure 2: The average performance over fifteen runs of the three algorithms, which respectively optimize the reference distribution, the global worst-case task, and the local worst-case distribution as guided by the reference distribution. The algorithms are evaluated in the satellite selection task on four criteria: reference distribution performance, worst-case task performance, local worst-case distribution performance, and the wall-clock time taken by the algorithm to construct the solution. Local represents the relative entropy-regularized Stochastic Greedy algorithm solving our novel formulation, aiming for local worst-case distributional robustness in the neighborhood of the reference distribution. Saturate (Global) represents the Submodular Saturation Algorithm proposed in [RSOS], aiming for global worst-case task robustness. Reference represents the Stochastic Greedy algorithm being used to directly optimize the utility of the reference distribution. The highlighted areas indicate one standard deviation. The results have been put through a moving average filter with window size 6.
  • Figure 3: A selection of atmospheric points of interest for the five atmospheric reading tasks $f^1, \ldots, f^5$ in one run of the simulation. Each task instantiates five points of interest, for a total of twenty-five points. The labels near the points indicate which atmospheric task a point belongs to.
  • Figure 4: The satellites selected by Algorithms 1, 2, and 3 on the 25th time iteration of one run of the simulation. The red points with their corresponding numbers indicate the atmospheric points of interest and the tasks they belong to. The blue points indicate the satellites in the constellation. The green points indicate the selected satellites at the current time iteration. The highlighted green areas indicate the ground coverage provided by the selected satellites. The reference distribution for this run of the simulation is $Q = [0.022, 0.267, 0.088, 0.087, 0.183, 0.353].$
  • Figure 5: The average performance over fifteen runs of Saturate with Preference in comparison to the Submodular Saturation Algorithm on the two objective functions with the highest assigned weight. The highlighted areas indicate one tenth of a standard deviation. The results have been put through a moving average filter with window size 6.
  • ...and 2 more figures
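
The distributions named in the Figure 1 caption can be computed for a small example. This is a sketch under stated assumptions: the Gibbs form $p_i^\ast \propto q_i \exp(-f_i/\lambda)$ for the local worst-case distribution comes from the standard relative-entropy dual and is not quoted from the paper, and the function names, the task values, and `lam` are hypothetical.

```python
import math


def worst_case_distribution(values):
    """P_worst: all weight on the task with the lowest utility (a simplex vertex)."""
    worst = min(range(len(values)), key=lambda i: values[i])
    return [1.0 if i == worst else 0.0 for i in range(len(values))]


def uniform_distribution(values):
    """P_avg: equal weight on every task (the simplex center)."""
    return [1.0 / len(values)] * len(values)


def local_worst_case_distribution(values, q, lam):
    """P*: assumed Gibbs form p_i proportional to q_i * exp(-f_i / lam)."""
    shift = min(values)  # subtracting the minimum keeps the exponentials stable
    weights = [qi * math.exp(-(v - shift) / lam) for v, qi in zip(values, q)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical utilities of three tasks for a fixed selected set, and a reference Q.
values = [0.9, 0.4, 0.7]
q = [0.2, 0.3, 0.5]

p_worst = worst_case_distribution(values)
p_avg = uniform_distribution(values)
p_star = local_worst_case_distribution(values, q, lam=0.5)
print(p_worst, p_avg, p_star)
```

In this sketch, a large `lam` keeps $P^\ast$ close to $Q$, while a small `lam` pushes it toward the vertex $P_{\text{worst}}$, which matches the picture of $P^\ast$ sitting in a neighborhood of $Q$ on the simplex.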

Theorems & Definitions (13)

  • Definition 1: Marginal gain
  • Definition 2: Submodularity
  • Definition 3: Normalized set functions
  • Definition 4: Monotone nondecreasing set functions
  • Definition 5: Weak-submodularity constant (WSC) [hibbard_hashemi_tanaka_topcu_2023]
  • Definition 6: Weak submodularity
  • Theorem 1: Stochastic Greedy approximation ratio
  • Theorem 2
  • proof
  • Theorem 3
  • ...and 3 more