hyppo: A Multivariate Hypothesis Testing Python Package
Sambit Panda, Satish Palaniappan, Junhao Xiong, Eric W. Bridgeford, Ronak Mehta, Cencheng Shen, Joshua T. Vogelstein
TL;DR
hyppo addresses the need for a unified, high-power Python platform for multivariate hypothesis testing, covering independence, two-sample, and k-sample problems with a broad suite of state-of-the-art tests. The library offers a consistent API, parallelized permutation inference, and JIT-compiled test statistics; extensive benchmarks show competitive performance and close agreement with R implementations. Key contributions include a wide test repertoire (distance- and kernel-based methods, time-series and conditional tests), a modular structure, and open-source availability with documentation. The work lets researchers perform robust multivariate testing in Python with extensible, well-tested tooling and transparent performance comparisons, and supports rapid application and extension in scientific workflows. P-values are computed via permutation tests, so inference stays nonparametric across diverse data regimes.
Abstract
We introduce hyppo, a unified library for performing multivariate hypothesis testing, including independence, two-sample, and k-sample testing. While many multivariate independence tests have R packages available, the interfaces are inconsistent and most are not available in Python. hyppo includes many state-of-the-art multivariate testing procedures. The package is easy to use and flexible enough to enable future extensions. The documentation and all releases are available at https://hyppo.neurodata.io.
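
To illustrate the kind of permutation inference described above, here is a minimal, standard-library-only sketch of an independence test built on a (biased) sample distance covariance. It is not hyppo's implementation or API: the function names `centered_distances`, `dcov_stat`, and `permutation_test`, the default of 200 permutations, and the choice of statistic are assumptions made for this example. Consult the package documentation for the actual interface.

```python
import math
import random


def centered_distances(points):
    """Double-centered Euclidean distance matrix for a list of points (tuples)."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def dcov_stat(a, b, perm=None):
    """Biased sample distance covariance from two centered matrices.

    `perm`, if given, reorders the samples behind `b`. Permuting rows and
    columns of the centered matrix is equivalent to permuting the samples
    first and centering afterwards.
    """
    n = len(a)
    idx = list(range(n)) if perm is None else perm
    total = sum(a[i][j] * b[idx[i]][idx[j]] for i in range(n) for j in range(n))
    return total / n**2


def permutation_test(x, y, reps=200, seed=0):
    """Return (statistic, p-value) for the null hypothesis that x and y are independent."""
    rng = random.Random(seed)
    a = centered_distances(x)
    b = centered_distances(y)
    observed = dcov_stat(a, b)
    n = len(x)
    exceed = 0
    for _ in range(reps):
        perm = list(range(n))
        rng.shuffle(perm)  # break any pairing between x and y
        if dcov_stat(a, b, perm) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    pvalue = (1 + exceed) / (1 + reps)
    return observed, pvalue


# Toy data (hypothetical): y is a linear function of x, so they are dependent.
x = [(float(i),) for i in range(30)]
y = [(2.0 * i + 1.0,) for i in range(30)]
stat, pvalue = permutation_test(x, y)
```

Because each permutation is evaluated independently of the others, the loop over `reps` is the natural place to parallelize, which is the design the summary attributes to the library.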
