Robust Information Selection for Hypothesis Testing with Misclassification Penalties
Jayanth Bhargav, Shreyas Sundaram, Mahsa Ghasemi
TL;DR
The paper addresses robust information selection for Bayesian hypothesis testing under adversarial disruption of information sources. It introduces a misclassification-penalty framework and defines the Robust Minimum Penalty Information Set Selection (R-MPIS) problem: minimize the worst-case misclassification penalty subject to a selection budget $K$ and an attack budget $A$, using observationally equivalent sets $F_{\theta}(\cdot)$ and a penalty matrix $\boldsymbol{\xi}$. Through curvature analysis, it develops a robust greedy algorithm with near-optimality guarantees, and then proposes a submodular surrogate objective (M-RMPIS) whose guarantees are stronger and easier to satisfy, obtained via a robust submodular maximization approach. Empirically, it validates the methods on a naval-threat surveillance case and on randomly generated instances, where the greedy schemes achieve near-optimal performance under source disruptions. The work advances robust information design for high-stakes hypothesis testing by pairing theoretical guarantees with empirical evidence of robustness to sensor failures or attacks.
Abstract
We study the problem of robust information selection for a Bayesian hypothesis testing / classification task, where the goal is to identify the true state of the world from a finite set of hypotheses based on observations from the selected information sources. We introduce a novel misclassification penalty framework, which enables non-uniform treatment of different misclassification events. Extending the classical subset selection framework, we study the problem of selecting a subset of sources that minimizes the maximum penalty of misclassification under a limited budget, despite deletions or failures of a subset of the selected sources. We characterize the curvature properties of the objective function and propose an efficient greedy algorithm with performance guarantees. Next, we highlight certain limitations of optimizing for the maximum penalty metric and propose a submodular surrogate metric to guide the selection of the information set. We propose a greedy algorithm with near-optimality guarantees for optimizing the surrogate metric. Finally, we empirically demonstrate the performance of our proposed algorithms in several instances of the information set selection problem.
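To make the structure of the robust selection problem concrete, the sketch below evaluates a worst-case objective by brute force and runs a naive greedy selection against it. It is an illustration only and not the paper's algorithm: `toy_penalty`, `WEIGHTS`, `RESOLVES`, and both function names are hypothetical stand-ins, and the toy penalty (total weight of the items that no selected source resolves) merely mimics a penalty that shrinks as sources are added. The true misclassification penalty depends on the observation model and the penalty matrix, neither of which is specified in this summary.

```python
from itertools import combinations

# Hypothetical toy data: each source "resolves" some items, each item has a weight.
WEIGHTS = {1: 5.0, 2: 1.0, 3: 2.0}
RESOLVES = {"a": {1, 2}, "b": {1, 2}, "c": {3}, "d": {3}}


def toy_penalty(selected):
    """Stand-in penalty: total weight of the items no selected source resolves."""
    covered = set().union(*(RESOLVES[s] for s in selected))
    return sum(w for item, w in WEIGHTS.items() if item not in covered)


def worst_case_penalty(selected, penalty, attack_budget):
    """Largest penalty over all deletions of at most `attack_budget` sources."""
    selected = frozenset(selected)
    worst = penalty(selected)
    for r in range(1, min(attack_budget, len(selected)) + 1):
        for removed in combinations(selected, r):
            worst = max(worst, penalty(selected - frozenset(removed)))
    return worst


def naive_robust_greedy(sources, penalty, budget, attack_budget):
    """Add, one at a time, the source that most lowers the worst-case penalty.

    Ties are broken by the order of `sources`. The brute-force inner
    evaluation is exponential in `attack_budget`, so this only suits tiny instances.
    """
    chosen = set()
    for _ in range(min(budget, len(sources))):
        best = min(
            (s for s in sources if s not in chosen),
            key=lambda s: worst_case_penalty(chosen | {s}, penalty, attack_budget),
        )
        chosen.add(best)
    return chosen


if __name__ == "__main__":
    picked = naive_robust_greedy(["a", "b", "c", "d"], toy_penalty, 3, 1)
    print(sorted(picked), worst_case_penalty(picked, toy_penalty, 1))
```

In this toy instance the first greedy step cannot distinguish between candidates, because deleting the only selected source always leaves the empty set. That tie is one way to see why a plain greedy rule can struggle on the worst-case objective and why a more careful robust scheme is needed.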
