A Normal Test for Independence via Generalized Mutual Information

Jialin Zhang, Zhiyi Zhang

TL;DR

The paper introduces a normal-test for independence between discrete random elements by leveraging escort distributions and a generalized mutual information framework. It shows that a $Z_{AB}$ statistic, derived from decomposing the generalized mutual information into two components, is asymptotically $N(0,1)$ under $H_0$ without needing the contingency table dimensions, and it is consistent against all alternatives. Through simulations, the method demonstrates robust performance in large or sparse contingency tables where Pearson's chi-squared test struggles, though it may incur some loss of power in small dense tables. This yields a practical alternative when standard chi-squared assumptions fail, with guidance on when to prefer the normal-test versus Pearson's test based on table size and sparsity.
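As a point of reference for the term "escort distribution", the sketch below uses the standard definition: each probability is raised to a power $\lambda>0$ and the result is renormalized. The function name `escort` and the example numbers are illustrative assumptions; this excerpt does not show how the paper combines escort distributions with mutual information, so the code should not be read as the authors' construction.

```python
def escort(probs, lam):
    """Standard escort distribution of order lam: p_i**lam / sum_j p_j**lam.

    `escort` is a hypothetical helper name, not taken from the paper.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    powered = [p ** lam for p in probs]
    total = sum(powered)
    return [w / total for w in powered]


# Example: the escort of an independent (product) joint distribution is the
# product of the escorts of its marginals, so independence is preserved.
p_x = [0.8, 0.2]
p_y = [0.5, 0.3, 0.2]
joint = [px * py for px in p_x for py in p_y]            # independent joint
escort_joint = escort(joint, 2.0)
product_of_escorts = [ex * ey for ex in escort(p_x, 2.0) for ey in escort(p_y, 2.0)]
```

The last two lists agree up to floating-point error, which is one reason an escort transformation is compatible with testing independence: it does not create dependence where none exists.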

Abstract

Testing the hypothesis of independence between two random elements on a joint alphabet is a fundamental exercise in statistics. Pearson's chi-squared test is an effective test for such a situation when the contingency table is relatively small. General statistical tools are lacking when the contingency tables are large or sparse. A test based on generalized mutual information is derived and proposed in this article. The new test has two desirable theoretical properties. First, the test statistic is asymptotically normal under the hypothesis of independence; consequently, it does not require knowledge of the row and column sizes of the contingency table. Second, the test is consistent and therefore detects any form of dependence structure in the general alternative space given a sufficiently large sample. In addition, simulation studies show that the proposed test converges faster than Pearson's chi-squared test when the contingency table is large or sparse.
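To make the contrast with the baseline concrete, here is a minimal sketch of Pearson's chi-squared statistic for a contingency table, written in plain Python. It illustrates why that test depends on the table dimensions: its reference distribution has $(r-1)(c-1)$ degrees of freedom. The function name `pearson_chi2`, the returned sparsity diagnostic, and the common rule-of-thumb threshold of 5 for expected counts are assumptions added for illustration. This is not the paper's $Z_{AB}$ statistic, whose formula does not appear in this excerpt.

```python
def pearson_chi2(table):
    """Pearson chi-squared statistic for an r x c table of counts.

    Returns (statistic, degrees_of_freedom, fraction_of_sparse_cells), where a
    cell is called sparse if its expected count under independence is below 5
    (a common rule of thumb, assumed here for illustration).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("drop empty rows/columns before testing")
    n = sum(row_totals)

    stat = 0.0
    sparse_cells = 0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # under independence
            stat += (observed - expected) ** 2 / expected
            if expected < 5:
                sparse_cells += 1

    n_rows, n_cols = len(table), len(table[0])
    dof = (n_rows - 1) * (n_cols - 1)  # requires knowing the table dimensions
    return stat, dof, sparse_cells / (n_rows * n_cols)


stat, dof, sparse_fraction = pearson_chi2([[10, 20], [20, 10]])
```

In a large or sparse table the true number of rows and columns may be unknown, since some categories never appear in the sample, so `dof` cannot be pinned down. That is the situation in which a statistic with a fixed $N(0,1)$ null distribution is attractive.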

Paper Structure

This paper contains 5 sections, 4 theorems, 23 equations, and 1 table.

Key Result

Lemma 1

Given $\lambda>0$,

Theorems & Definitions (6)

  • Lemma 1
  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Proof of Proposition 1
  • Proof of Proposition 3