Directional testing for one-way MANOVA in divergent dimensions

Caizhu Huang, Claudia Di Caterina, Nicola Sartori

TL;DR

This work introduces an exact directional test for the equality of $g$ normal mean vectors under a common covariance matrix in high-dimensional settings, valid whenever $\sum_{i=1}^g n_i \ge p+g+1$. When the covariance matrices are identical, the directional $p$-value is exactly Uniform$(0,1)$ under the null hypothesis, and for $g=2$ the test reduces to Hotelling's $T^2$. In the more general Behrens-Fisher-like scenario with unequal covariance matrices, exactness is lost, but the directional test remains competitive with, and often more accurate than, standard likelihood-based approaches in moderate dimensions. In simulations, the directional test controls size more reliably than the LRT, the Bartlett-corrected test, and Skovgaard's adjustments in many high-dimensional settings, and it shows reasonable resilience to departures from multivariate normality. Applications to sports physiology and genomics data illustrate the method on high-dimensional MANOVA problems.

Abstract

Testing the equality of mean vectors across $g$ different groups plays an important role in many scientific fields. In regular frameworks, likelihood-based statistics under the normality assumption offer a general solution to this task. However, standard asymptotic results are not reliable when the dimension $p$ of the data is large relative to the sample size $n_i$ of each group. We propose here an exact directional test for the equality of $g$ normal mean vectors with identical unknown covariance matrix in a high-dimensional setting, provided that $\sum_{i=1}^g n_i \ge p+g+1$. In the case of two groups ($g=2$), the directional test coincides with Hotelling's $T^2$ test. In the more general situation where the $g$ independent groups may have different unknown covariance matrices, exactness does not hold, but simulation studies show that the directional test is more accurate than the most commonly used likelihood-based solutions, at least in a moderate-dimensional setting in which $p=O(n_i^\tau)$, $\tau\in (0,1)$. Robustness of the directional approach and its competitors under deviation from the assumption of multivariate normality is also numerically investigated. Our proposal is applied to data on blood characteristics of male athletes and to microarray data storing gene expressions in patients with breast tumors.
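As an illustration of the two-group case mentioned in the abstract, the following is a minimal sketch of the classical two-sample Hotelling's $T^2$ test with a pooled covariance estimate. It is not the authors' directional-test code: the function name `hotelling_t2_two_sample` and its interface are assumptions made for this example, and the sample-size check simply applies the condition $\sum_{i=1}^g n_i \ge p+g+1$ with $g=2$.

```python
import numpy as np
from scipy import stats


def hotelling_t2_two_sample(x1, x2):
    """Classical two-sample Hotelling's T^2 test (common covariance).

    Hypothetical helper for illustration; rows are observations,
    columns are the p variables. Returns (T2, F statistic, p-value).
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, p = x1.shape
    n2, p2 = x2.shape
    if p != p2:
        raise ValueError("both samples must have the same number of variables")

    n, g = n1 + n2, 2
    # Sample-size condition quoted in the abstract, specialised to g = 2.
    if n < p + g + 1:
        raise ValueError("need n1 + n2 >= p + g + 1")

    diff = x1.mean(axis=0) - x2.mean(axis=0)
    # Pooled (within-group) unbiased covariance estimate.
    s1 = np.atleast_2d(np.cov(x1, rowvar=False))
    s2 = np.atleast_2d(np.cov(x2, rowvar=False))
    s_pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n - 2)

    # T^2 = (n1 n2 / n) * diff' S_pooled^{-1} diff
    t2 = (n1 * n2 / n) * float(diff @ np.linalg.solve(s_pooled, diff))
    # Under H0, (n - p - 1) / (p (n - 2)) * T^2 follows F(p, n - p - 1).
    f_stat = (n - p - 1) / (p * (n - 2)) * t2
    p_value = float(stats.f.sf(f_stat, p, n - p - 1))
    return t2, f_stat, p_value
```

Because the null distribution of the rescaled statistic is an exact $F$ distribution under normality and a common covariance matrix, the resulting $p$-value is exact for any $p$ satisfying the sample-size condition, which is the property the directional test extends to $g > 2$ groups.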

Paper Structure

This paper contains 32 sections, 1 theorem, 50 equations, 17 figures, and 15 tables.

Key Result

Lemma 1

The estimator $\hat{\Lambda}^{-1}(t)$ is positive definite if and only if $t \in \left[0, 1/\sqrt{\nu_{(p)}}\right]$, where $\nu_{(p)}$ is the largest eigenvalue of $(B_0^\top)^{-1} (A/n) B_0^{-1}$, with $\hat{\Lambda}^{-1}_0 = B_0^\top B_0$.
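The upper endpoint in Lemma 1 can be computed numerically. The sketch below is only an illustration under stated assumptions: the function name `max_t_positive_definite` is hypothetical, $B_0$ is taken to be the transposed Cholesky factor of $\hat{\Lambda}^{-1}_0$ (one valid choice satisfying $\hat{\Lambda}^{-1}_0 = B_0^\top B_0$), and returning infinity when $\nu_{(p)} \le 0$ is a convention of this example rather than part of the lemma.

```python
import numpy as np


def max_t_positive_definite(lambda0_inv, a, n):
    """Upper endpoint 1 / sqrt(nu_(p)) from Lemma 1.

    lambda0_inv : (p, p) symmetric positive definite matrix (Lambda_0^{-1} hat)
    a           : (p, p) symmetric matrix A
    n           : total sample size
    Hypothetical helper written for illustration only.
    """
    lambda0_inv = np.asarray(lambda0_inv, dtype=float)
    a = np.asarray(a, dtype=float)

    # NumPy returns lower-triangular L with L @ L.T == lambda0_inv,
    # so B_0 = L.T satisfies lambda0_inv == B_0.T @ B_0.
    chol = np.linalg.cholesky(lambda0_inv)

    # (B_0^T)^{-1} (A/n) B_0^{-1} = L^{-1} (A/n) L^{-T}, via two solves.
    half = np.linalg.solve(chol, a / n)    # L^{-1} (A/n)
    m = np.linalg.solve(chol, half.T).T    # L^{-1} (A/n) L^{-T}
    m = (m + m.T) / 2.0                    # remove round-off asymmetry

    nu_max = np.linalg.eigvalsh(m)[-1]     # eigenvalues come back ascending
    if nu_max <= 0:
        # Convention of this sketch: no finite bound in this case.
        return float("inf")
    return float(1.0 / np.sqrt(nu_max))
```

For example, with `lambda0_inv` equal to the $2 \times 2$ identity, `a = np.diag([0.5, 0.08])` and `n = 2`, the matrix $A/n$ has largest eigenvalue $0.25$, so the bound is $1/\sqrt{0.25} = 2$.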

Figures (17)

  • Figure 1: Empirical null distribution of $p$-values from the likelihood ratio test (left) and directional test (right) for the hypothesis of equality of normal mean vectors in $g=4$ groups with identical covariance matrix, based on $10{,}000$ Monte Carlo simulations. Data are generated from a $N_5(\mu, \Lambda^{-1})$ distribution with mean vector $\mu$ and covariance matrix $\Lambda^{-1}$ equal to the sample mean and sample covariance matrix, respectively, of the Pottery dataset. The total sample size is $n = \sum_{i=1}^{4}n_i=26$.
  • Figure 2: Empirical size of the directional test (DT), central limit theorem test (CLT), likelihood ratio test (LRT), Bartlett corrected test (BC) and the two modifications of Skovgaard (2001) (Sko1 and Sko2) for hypothesis (\ref{hypothesis:dirmean}) with $g=3$, at nominal level $\alpha = 0.05$ given by the dashed gray horizontal line. The left and right panels correspond to $n_i = 100, 500$, respectively.
  • Figure 3: Empirical size of the directional test (DT), Behrens-Fisher test (BF; Nel and Van der Merwe, 1986), likelihood ratio test (LRT), and the two modifications of Skovgaard (2001) (Sko1 and Sko2) for hypothesis (\ref{hypothesis:meansdiff}) with $g=2$, at nominal level $\alpha = 0.05$ given by the dashed gray horizontal line. The left and right panels correspond to $n_i = 100, 500$, respectively.
  • Figure S1: Empirical size of the directional test (DT), central limit theorem test (CLT), likelihood ratio test (LRT), Bartlett corrected test (BC) and the two modifications of Skovgaard (2001) (Sko1 and Sko2) for hypothesis (8) with $g=3$, at nominal level $\alpha = 0.05$ given by the dashed gray horizontal line. The results correspond to $n_i = 1000$.
  • Figure S2: Empirical size of the directional test (DT), central limit theorem test (CLT), likelihood ratio test (LRT), Bartlett corrected test (BC) and the two modifications of Skovgaard (2001) (Sko1 and Sko2) for hypothesis (8) in the paper with $g=3$, at nominal level $\alpha = 0.05$ given by the gray horizontal line. The left, middle and right panels correspond to $n_i = 100, 500, 1000$, respectively.
  • ...and 12 more figures

Theorems & Definitions (1)

  • Lemma 1