Classifying Overlapping Gaussian Mixtures in High Dimensions: From Optimal Classifiers to Neural Nets
Khen Cohen, Noam Levi, Yaron Oz
TL;DR
The paper derives Bayes-optimal decision boundaries for binary classification of overlapping high-dimensional Gaussian mixtures, in both the population and empirical limits, and clarifies how the eigenvalues and eigenvectors of the class covariances shape those boundaries. It then shows that neural networks, including two-layer models with quadratic activations, can closely approximate these Bayes rules, with KKT convergence offering a theoretical lens on the gradient dynamics. Using toy models and real-data experiments (FMNIST, CIFAR-10) with covariance-flip and spectral tests, the authors demonstrate that in high dimensions the decision thresholds are determined largely by the covariance eigenvectors rather than the eigenvalues, which links GMM theory to neural network behavior. The findings illuminate how neural networks distill probabilistic structure from complex distributions and offer practical intuition for when and why covariance geometry guides learned classifiers.
Abstract
We derive closed-form expressions for the Bayes optimal decision boundaries in binary classification of high-dimensional, overlapping Gaussian mixture model (GMM) data, and show how they depend on the eigenstructure of the class covariances for several cases of structured data. We empirically demonstrate, through experiments on synthetic GMMs inspired by real-world data, that deep neural networks trained for classification learn predictors that approximate the derived optimal classifiers. We further extend our study to networks trained on real data, observing that decision thresholds correlate with the covariance eigenvectors rather than the eigenvalues, mirroring our GMM analysis. These results provide theoretical insight into neural networks' ability to perform probabilistic inference and distill statistical patterns from intricate distributions.
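As a minimal illustration of the setting described above (a sketch, not the authors' code), the Bayes rule for two Gaussian classes with equal priors can be evaluated by thresholding the log-likelihood ratio. The example simplifies to one zero-mean Gaussian per class rather than a full mixture. The dimension, spectrum, random eigenbases, sample size, and all function names are assumptions chosen for illustration. Both covariances share the same eigenvalues and differ only in their eigenvectors, so the log-determinant terms cancel and the boundary is set by the eigenvectors alone.

```python
import numpy as np


def gaussian_log_density(x, mean, cov):
    """Log-density of N(mean, cov) evaluated at each row of x."""
    d = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    # Solve cov @ y = diff^T instead of forming the explicit inverse.
    solved = np.linalg.solve(cov, diff.T).T
    mahalanobis = np.sum(diff * solved, axis=1)
    return -0.5 * (mahalanobis + logdet + d * np.log(2.0 * np.pi))


def bayes_predict(x, mean_a, cov_a, mean_b, cov_b):
    """Return 0 for class A and 1 for class B, assuming equal priors."""
    score = gaussian_log_density(x, mean_a, cov_a) - gaussian_log_density(
        x, mean_b, cov_b
    )
    return (score < 0).astype(int)


def random_orthogonal(rng, d):
    """Random orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q


# Hypothetical toy setup: same eigenvalues, different eigenvectors.
rng = np.random.default_rng(0)
d, n_per_class = 20, 2000
eigenvalues = np.linspace(0.1, 5.0, d)
q_a, q_b = random_orthogonal(rng, d), random_orthogonal(rng, d)
cov_a = q_a @ np.diag(eigenvalues) @ q_a.T
cov_b = q_b @ np.diag(eigenvalues) @ q_b.T
mean = np.zeros(d)

# Draw x = Q diag(sqrt(lambda)) z with z ~ N(0, I), so that Cov(x) = Q diag(lambda) Q^T.
x_a = (rng.standard_normal((n_per_class, d)) * np.sqrt(eigenvalues)) @ q_a.T
x_b = (rng.standard_normal((n_per_class, d)) * np.sqrt(eigenvalues)) @ q_b.T
x = np.vstack([x_a, x_b])
labels = np.concatenate([np.zeros(n_per_class, dtype=int), np.ones(n_per_class, dtype=int)])

preds = bayes_predict(x, mean, cov_a, mean, cov_b)
accuracy = float(np.mean(preds == labels))
print(f"Bayes-rule accuracy on the toy data: {accuracy:.3f}")
```

Because the class means coincide, no linear classifier can separate the two classes here; the decision function is purely quadratic in x, which is the kind of boundary a two-layer network with quadratic activations can represent.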
