Compressed Hypothesis Testing: To Mix or Not to Mix?

Myung Cho, Weiyu Xu, Lifeng Lai

TL;DR

Mixed observations of random variables can strictly improve the error exponent of hypothesis testing over separate observations of individual random variables, implying that mixed observations can reduce the number of samples required in hypothesis testing applications.

Abstract

In this paper, we study the problem of determining $k$ anomalous random variables that have different probability distributions from the remaining $(n-k)$ random variables. Instead of sampling each individual random variable separately as in conventional hypothesis testing, we propose to perform hypothesis testing using mixed observations that are functions of multiple random variables. We characterize the error exponents for correctly identifying the $k$ anomalous random variables under fixed time-invariant mixed observations, random time-varying mixed observations, and deterministic time-varying mixed observations. For our error exponent characterization, we introduce the notions of inner conditional Chernoff information and outer conditional Chernoff information. We demonstrate that mixed observations can strictly improve the error exponents of hypothesis testing over separate observations of individual random variables. We further characterize the optimal sensing vector maximizing the error exponents, which leads to explicit constructions of the optimal mixed observations in special cases of hypothesis testing for Gaussian random variables. These results show that mixed observations of random variables can reduce the number of required samples in hypothesis testing applications. To solve large-scale hypothesis testing problems, we also propose efficient algorithms: LASSO-based and message-passing-based hypothesis testing algorithms.
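To make the idea of a mixed observation concrete, the following is a minimal sketch, not the authors' implementation. It assumes independent zero-mean Gaussian random variables, with normal variables distributed as $\mathcal{N}(0,1)$ and anomalous ones as $\mathcal{N}(0,100)$ (the setting used in the Figure 5 caption), and a linear mixed observation $Y^j = {\boldsymbol a}^T {\boldsymbol X}^j$. The function names `mixed_observations` and `mixed_variance` are hypothetical.

```python
import random


def mixed_observations(a, anomalous, m, sigma_normal=1.0, sigma_abnormal=10.0, seed=0):
    """Draw m scalar mixed observations Y^j = a^T X^j.

    Assumption: the X_i are independent zero-mean Gaussians; indices in
    `anomalous` have standard deviation sigma_abnormal, all others sigma_normal.
    """
    rng = random.Random(seed)
    observations = []
    for _ in range(m):
        # One fresh realization X^j of all n random variables.
        x = [
            rng.gauss(0.0, sigma_abnormal if i in anomalous else sigma_normal)
            for i in range(len(a))
        ]
        # The mixed observation is a single linear combination of X^j.
        observations.append(sum(a_i * x_i for a_i, x_i in zip(a, x)))
    return observations


def mixed_variance(a, anomalous, sigma_normal=1.0, sigma_abnormal=10.0):
    """Variance of a^T X under the hypothesis that `anomalous` is the anomalous set."""
    return sum(
        a_i ** 2 * (sigma_abnormal ** 2 if i in anomalous else sigma_normal ** 2)
        for i, a_i in enumerate(a)
    )


# Sensing vector that mixes the first two of three variables.
a = [1.0, 1.0, 0.0]
ys = mixed_observations(a, anomalous={0}, m=5)
var_if_0_anomalous = mixed_variance(a, {0})
var_if_2_anomalous = mixed_variance(a, {2})
```

Under these assumptions each hypothesis about the anomalous set induces a different zero-mean Gaussian distribution on $Y^j$ whenever the resulting variances differ, so a single mixed observation carries information about several random variables at once. A separate observation corresponds to a sensing vector with exactly one nonzero entry.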

Paper Structure

This paper contains 25 sections, 10 theorems, 117 equations, 16 figures, 2 tables, 6 algorithms.

Key Result

Theorem 3.1

Consider fixed time-invariant measurements $Y^j={{\boldsymbol a}^j}^T {\boldsymbol X}^j = {\boldsymbol a}^T {\boldsymbol X}^j$, $1 \leq j \leq m$, for $n$ random variables $X_1, X_2, \ldots, X_n$. Algorithms alg:likelihoodratiotest_timeinvariant and alg:pairwise_timeinvariant correctly identify the $k$ anomalous random variables [...], where [...] is the Chernoff information between two probability distributions $p_v$ and $p_w$.
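For reference, the Chernoff information between two densities $p$ and $q$ is commonly defined as $-\min_{0 \leq \lambda \leq 1} \log \int p^{\lambda}(x)\, q^{1-\lambda}(x)\, dx$. The sketch below evaluates it numerically for two zero-mean Gaussians, for which the integral has the closed form $\sigma_p^{-\lambda} \sigma_q^{-(1-\lambda)} \left(\lambda/\sigma_p^2 + (1-\lambda)/\sigma_q^2\right)^{-1/2}$. The Gaussian restriction, the grid search over $\lambda$, and the function names are assumptions made for illustration; they are not taken from the paper.

```python
import math


def log_chernoff_integral(lam, var_p, var_q):
    """log of the integral of p^lam * q^(1-lam) for p = N(0, var_p), q = N(0, var_q)."""
    return (
        -0.5 * lam * math.log(var_p)
        - 0.5 * (1.0 - lam) * math.log(var_q)
        - 0.5 * math.log(lam / var_p + (1.0 - lam) / var_q)
    )


def chernoff_information(var_p, var_q, grid=1001):
    """Approximate Chernoff information by a grid search over lam in [0, 1].

    Assumption: a uniform grid is accurate enough here because the
    objective is smooth and convex in lam.
    """
    return -min(
        log_chernoff_integral(i / (grid - 1), var_p, var_q) for i in range(grid)
    )


# Normal vs. anomalous variances as in the Figure 5 caption.
c_separate = chernoff_information(1.0, 100.0)
```

A larger Chernoff information means the two hypotheses can be distinguished with fewer samples; the value is zero when the two distributions coincide and is symmetric in its arguments.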

Figures (16)

  • Figure 1: Comparison of the Chernoff information with mixed measurements against the Chernoff information with separate measurements by varying the variance in Example 4. OCI represents the outer Chernoff information, i.e., Chernoff information with mixed measurements, and CI represents the Chernoff information with separate measurements.
  • Figure 2: Comparison between the outer Chernoff information and the inner Chernoff information. The parameters $m$ and $n_r$ are set to 10 and 5 respectively.
  • Figure 3: Illustration of a factor graph (a) constructed from a matrix (b). The random variables $X_i$ and $Y^j$ are treated as variable nodes and check nodes in the graph, respectively.
  • Figure 4: (a) Message sent from a check node to a variable node. The message sent from a check node $Y^1$ to a variable node $X_2$ (red arrow) is expressed as the probability ${\mathbb P}_{Y^1 \rightarrow X_2} (X_2~ \text{is abnormal} \;|\; Y^1 = y^1)$ by considering probabilities ${\mathbb P}_{X_i \rightarrow Y^1}$, $i=4,5,8$ (blue arrow). (b) Message sent from a variable node to a check node. The message sent from a variable node $X_2$ to a check node $Y^1$ (red arrow) is expressed as the probability ${\mathbb P}_{X_2 \rightarrow Y^1} (X_2~ \text{is abnormal})$ by considering probability ${\mathbb P}_{Y^2 \rightarrow X_2}$.
  • Figure 5: Error probability in log scale as the number of measurements $m$ is varied, when $(n,k)=(100,1)$. The normal and abnormal random variables follow ${\mathcal{N}}(0,1)$ and ${\mathcal{N}}(0,100)$ respectively.
  • ...and 11 more figures

Theorems & Definitions (12)

  • Theorem 3.1
  • Definition 3.2: Inner Conditional Chernoff Information
  • Theorem 3.3
  • Definition 3.4: Outer Conditional Chernoff Information
  • Theorem 3.5
  • Theorem 4.1
  • Theorem 4.2
  • Lemma 5.1
  • Lemma 5.2
  • Lemma 5.3
  • ...and 2 more