Submodular Information Selection for Hypothesis Testing with Misclassification Penalties

Jayanth Bhargav; Mahsa Ghasemi; Shreyas Sundaram

TL;DR

This paper develops a framework for selecting a subset of information sources to identify the true hypothesis under misclassification penalties in a centralized Bayesian setting. It poses two problems, minimum-cost information set selection (MCIS) and minimum-penalty information set selection (MPIS), and proves that the MCIS constraint and the MPIS objective are weakly submodular, which yields high-probability performance guarantees for greedy algorithms. It also proposes an alternate metric based on the total misclassification penalty; this metric is fully submodular and gives near-optimal greedy guarantees for both problems. The authors provide finite-sample convergence results for Bayesian beliefs and validate the theory with simulations on a 10-class aerial vehicle classification task and on randomly generated instances. The methods apply to resource-aware sensor and feature selection for hypothesis testing with asymmetric penalties.
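To make the setting concrete, the following is a minimal sketch of a Bayesian belief update from finite samples, followed by a belief-weighted misclassification penalty. It assumes discrete observations with strictly positive likelihoods, and it assumes the penalty for a true hypothesis is the penalty-matrix row weighted by the posterior belief. The function names `bayes_update` and `belief_weighted_penalty`, the data layout, and the penalty definition are illustrative assumptions, not taken from the paper.

```python
import math

def bayes_update(prior, likelihoods, observations):
    """Posterior over hypotheses after i.i.d. discrete observations.

    likelihoods[h][o] is P(observation o | hypothesis h), assumed > 0.
    """
    log_post = [math.log(p) for p in prior]
    for o in observations:
        for h, row in enumerate(likelihoods):
            log_post[h] += math.log(row[o])
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_post)
    unnorm = [math.exp(v - peak) for v in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def belief_weighted_penalty(penalty_matrix, belief, true_idx):
    """Assumed penalty: row of the true hypothesis weighted by the belief."""
    return sum(x * b for x, b in zip(penalty_matrix[true_idx], belief))

# Two hypotheses, binary observations, uniform prior.
likelihoods = [[0.9, 0.1], [0.2, 0.8]]
penalties = [[0.0, 5.0], [1.0, 0.0]]  # asymmetric: missing hypothesis 0 costs more
posterior = bayes_update([0.5, 0.5], likelihoods, [0, 0, 1])
penalty = belief_weighted_penalty(penalties, posterior, true_idx=0)
```

With more samples from the true hypothesis, the posterior concentrates on it and the penalty in this sketch shrinks toward the diagonal entry of the penalty matrix.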

Abstract

We consider the problem of selecting an optimal subset of information sources for a hypothesis testing/classification task, where the goal is to identify the true state of the world from a finite set of hypotheses based on finite observation samples from the sources. To characterize the learning performance, we propose a misclassification penalty framework, which enables nonuniform treatment of different misclassification errors. In a centralized Bayesian learning setting, we study two variants of the subset selection problem: (i) selecting a minimum-cost information set to ensure that the maximum penalty of misclassifying the true hypothesis is below a desired bound, and (ii) selecting an optimal information set under a limited budget to minimize the maximum penalty of misclassifying the true hypothesis. Under certain assumptions, we prove that the objective (or constraints) of these combinatorial optimization problems are weakly (or approximately) submodular, and establish high-probability performance guarantees for greedy algorithms. Further, we propose an alternate metric for information set selection which is based on the total penalty of misclassification. We prove that this metric is submodular and establish near-optimal guarantees for the greedy algorithms for both information set selection problems. Finally, we present numerical simulations to validate our theoretical results over several randomly generated instances.
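The budgeted variant (ii) can be illustrated with a generic cost-benefit greedy rule: repeatedly add the affordable source with the largest marginal gain per unit cost. The sketch below is a standard greedy heuristic under stated assumptions, not the paper's algorithm; the function `greedy_budgeted`, the toy sources, their costs, and the coverage-style gain (number of hypothesis pairs a set of sources can distinguish) are all hypothetical stand-ins for a penalty-reduction objective.

```python
def greedy_budgeted(sources, cost, budget, gain):
    """Greedy selection by marginal gain per unit cost under a budget.

    gain maps a frozenset of sources to a number and is assumed monotone.
    """
    selected = set()
    spent = 0.0
    while True:
        base = gain(frozenset(selected))
        best, best_ratio = None, 0.0
        for s in sources:
            if s in selected or spent + cost[s] > budget:
                continue  # already chosen or unaffordable
            ratio = (gain(frozenset(selected | {s})) - base) / cost[s]
            if ratio > best_ratio:
                best, best_ratio = s, ratio
        if best is None:  # nothing affordable improves the objective
            return selected
        selected.add(best)
        spent += cost[best]

# Hypothetical example: each source distinguishes some pairs of hypotheses.
distinguishes = {
    "radar": {(0, 1), (0, 2)},
    "camera": {(0, 1), (1, 2)},
    "lidar": {(1, 2)},
}
cost = {"radar": 2.0, "camera": 1.0, "lidar": 1.0}

def coverage(subset):
    """Number of hypothesis pairs distinguished by at least one source."""
    covered = set()
    for s in subset:
        covered |= distinguishes[s]
    return len(covered)

chosen = greedy_budgeted(list(distinguishes), cost, budget=3.0, gain=coverage)
```

Here the camera is picked first (two pairs for unit cost), then the radar, which exhausts the budget. For a weakly submodular objective, the quality of such a greedy choice depends on the submodularity ratio.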

Paper Structure (25 sections, 17 theorems, 49 equations, 4 figures, 2 algorithms)

Introduction
Related Work
Contributions
Minimum-Cost Information Set Selection Problem
Weak Submodularity and Greedy Algorithm
Minimum-Penalty Information Set Selection
Alternate Penalty Metric for Information Set Selection
Empirical Evaluation
Conclusion
Proofs
Proof of Theorem \ref{thm2}
Proof of Corollary \ref{coro1}
Proof of Lemma \ref{lma:submod}
Proof of Lemma \ref{lma2}
Proof of Theorem \ref{thm:greedy}
...and 10 more sections

Key Result

Theorem 2

Let the true state of the world be $\theta_p$ and let $\mu_0(\theta) = \frac{1}{m}$ for all $\theta \in \Theta$ (i.e., a uniform prior). Under Assumption 1, for any $\delta, \epsilon \in [0,1]$, with $L$ as defined in Equation (eq:kld_ratio), and for an information set $\mathcal{I} \subseteq \ldots$ … where $K(\theta_p, \theta_q) = D_{KL}(\ell_{\mathcal{I}}(\cdot \mid \theta_p) \,\|\, \ell_{\mathcal{I}}(\cdot \ldots$ …
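The quantity $K(\theta_p, \theta_q)$ in the theorem is a Kullback-Leibler divergence between likelihood functions. As a small illustration, the sketch below computes the KL divergence between two discrete distributions. It assumes finite observation alphabets and that the second distribution is positive wherever the first is; the function name `kl_divergence` and the example likelihoods are hypothetical.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) in nats for discrete distributions on the same support.

    Terms with p[i] == 0 contribute zero; q[i] is assumed > 0 where p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical likelihoods of a binary observation under two hypotheses.
lik_p = [0.9, 0.1]
lik_q = [0.2, 0.8]
k_pq = kl_divergence(lik_p, lik_q)
k_pp = kl_divergence(lik_p, lik_p)  # a distribution has zero divergence from itself
```

A larger divergence means the two hypotheses are easier to tell apart from the same number of samples, which is why such terms appear in finite-sample convergence bounds.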

Figures (4)

Figure 1: (a) Penalty Matrix for the Aerial Vehicle Classification (AVC) task, (b) Performance of Algorithm \ref{alg:greedy} (for Problem \ref{prob:dsrc}), (c) Performance of Algorithm \ref{alg:greedy2} (for Problem \ref{prob:mpis}).
Figure 2: Finite Sample Convergence of Bayesian Beliefs
Figure 3: Performance of the greedy algorithm for information set selection for varying submodularity ratios: (a) Algorithm \ref{alg:greedy} (Problem \ref{prob:dsrc}), (b) Algorithm \ref{alg:greedy2} (Problem \ref{prob:mpis}).
Figure 4: Performance of Greedy: (a) Algorithm \ref{alg:greedy} (Problem \ref{prob:mmcis}), (b) Algorithm \ref{alg:greedy2} (Problem \ref{prob:mmpis}).

Theorems & Definitions (22)

Definition 1: Observationally Equivalent Set
Theorem 2
Corollary 3
Definition 4: Monotonicity
Definition 5: Submodularity Ratio
Remark 6
Lemma 7
Lemma 8
Theorem 9
Corollary 10
...and 12 more
