Nonparametric Estimation of Joint Entropy via Partitioned Sample-Spacing
Jungwoo Ho, Sangun Park, Soyeong Oh
Abstract
We propose a nonparametric estimator of multivariate joint entropy based on partitioned sample spacing (PSS). The method extends univariate spacing ideas to $\mathbb{R}^{d}$ by partitioning the sample space into localized cells and aggregating within-cell statistics, with strong consistency guarantees under mild conditions. In benchmarks across diverse distributions, PSS consistently outperforms $k$-nearest neighbor ($k$NN) estimators and achieves accuracy competitive with recent normalizing-flow (NF) based methods, while requiring no training or auxiliary density modeling. The estimator scales favorably in moderately high dimensions ($d = 10$--$40$) and is particularly robust to correlated or skewed distributions. These properties position PSS as a practical and reliable alternative to both $k$NN and NF-based entropy estimators, with broad utility in information-theoretic machine learning tasks such as total-correlation estimation, representation learning, and feature selection.
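As background for the "univariate spacing ideas" mentioned above, the following is a minimal sketch of the classical Vasicek $m$-spacing entropy estimator for a one-dimensional sample. It is not the PSS estimator itself, whose partitioning and aggregation steps are not specified in the abstract. The function name `vasicek_entropy` and the default window $m \approx \sqrt{n}$ are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np


def vasicek_entropy(x, m=None):
    """Vasicek m-spacing estimate of differential entropy (nats), 1-D sample.

    Background illustration only; this is NOT the paper's PSS estimator.
    """
    x = np.sort(np.asarray(x, dtype=float).ravel())
    n = x.size
    if m is None:
        # Assumed heuristic: window size on the order of sqrt(n).
        m = max(1, int(round(np.sqrt(n))))
    if not (1 <= m < n / 2):
        raise ValueError("m must satisfy 1 <= m < n/2")

    idx = np.arange(n)
    # Order statistics x_(i+m) and x_(i-m), clamped at the sample extremes.
    upper = x[np.minimum(idx + m, n - 1)]
    lower = x[np.maximum(idx - m, 0)]

    # Guard against zero spacings caused by tied observations.
    spacings = np.maximum(upper - lower, np.finfo(float).tiny)

    # H_hat = (1/n) * sum_i log( n / (2m) * (x_(i+m) - x_(i-m)) )
    return float(np.mean(np.log(n / (2.0 * m) * spacings)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.standard_normal(5000)
    true_entropy = 0.5 * np.log(2.0 * np.pi * np.e)  # entropy of N(0, 1)
    print(vasicek_entropy(sample), true_entropy)
```

The estimator needs only a sort and differences of order statistics, which is the sense in which spacing-based methods require no training or density model. Extending this idea to $\mathbb{R}^{d}$ is the subject of the paper.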
