uGMM-NN: Univariate Gaussian Mixture Model Neural Network

Zakeria Sharif Ali

TL;DR

The paper introduces uGMM-NN, a feed-forward architecture in which every neuron outputs the log-density of a univariate Gaussian mixture, enabling multimodal representations and uncertainty quantification within deep networks. It demonstrates competitive discriminative performance on MNIST and Iris, and shows how probabilistic, interpretable activations can be integrated into CNNs by replacing dense layers with uGMM units. The approach links neural computation with probabilistic modeling, drawing connections to probabilistic circuits and interpretability frameworks, while highlighting potential for generative-inference extensions and uncertainty-aware design. Future work focuses on scalable most probable explanation (MPE) inference, extensions to sequential architectures, and sparsity-driven efficiency to enable larger-scale and multimodal applications.

Abstract

This paper introduces the Univariate Gaussian Mixture Model Neural Network (uGMM-NN), a novel neural architecture that embeds probabilistic reasoning directly into the computational units of deep networks. Unlike traditional neurons, which apply weighted sums followed by fixed non-linearities, each uGMM-NN node parameterizes its activations as a univariate Gaussian mixture, with learnable means, variances, and mixing coefficients. This design enables richer representations by capturing multimodality and uncertainty at the level of individual neurons, while retaining the scalability of standard feed-forward networks. We demonstrate that uGMM-NN can achieve competitive discriminative performance compared to conventional multilayer perceptrons, while additionally offering a probabilistic interpretation of activations. The proposed framework provides a foundation for integrating uncertainty-aware components into modern neural architectures, opening new directions for both discriminative and generative modeling.

Paper Structure

This paper contains 16 sections, 7 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Schematic of a uGMM-NN neuron. Each neuron models a uGMM with $N$ components based on inputs from the previous layer, with each connection parameterized by mean ($\mu_{j,k}$), standard deviation ($\sigma_{j,k}$), and mixing weight ($w_{j,k}$), outputting the log probability $\log P_j(y)$.
  • Figure 2: Illustration of a single uGMM neuron with three univariate Gaussian mixture components. The means ($\mu_{j,k}$), variances ($\sigma_{j,k}^2$), and mixing weights ($w_{j,k}$) define the neuron's probability density $P_j(y)$. The dashed black line shows the combined mixture density.
  • Figure 3: Comparison of test loss convergence between MLP, uGMM-NN, and their LayerNorm-enhanced or CNN-augmented counterparts. The uGMM layers achieve comparable or superior final loss and stability, particularly when Layer Normalization is applied to stabilize the log-density activations.
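
The neuron described in Figure 1 can be illustrated with a minimal sketch in plain Python. This is not the author's implementation. It assumes that component $k$ scores the $k$-th input from the previous layer under $\mathcal{N}(\mu_{j,k}, \sigma_{j,k}^2)$ and that the mixing weights $w_{j,k}$ come from a softmax over unconstrained scores; the function name `ugmm_neuron_log_density` and the `logits` parameter are hypothetical. Under those assumptions the output is $\log P_j = \log \sum_k w_{j,k}\, \mathcal{N}(x_k; \mu_{j,k}, \sigma_{j,k}^2)$, evaluated with log-sum-exp for numerical stability.

```python
import math


def ugmm_neuron_log_density(x, mu, sigma, logits):
    """Log-density output of one hypothetical uGMM neuron j.

    x      : inputs from the previous layer (length N)
    mu     : per-connection means mu_{j,k} (length N)
    sigma  : per-connection standard deviations sigma_{j,k} > 0 (length N)
    logits : unconstrained scores; softmax turns them into weights w_{j,k}
             (assumption: the source only says the weights are learnable)
    """
    # Softmax in log space, so the weights are positive and sum to 1.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
    log_w = [l - log_norm for l in logits]

    # log w_k + log N(x_k; mu_k, sigma_k^2) for every component k.
    # Assumption: component k is evaluated at the k-th input.
    terms = [
        lw - 0.5 * math.log(2.0 * math.pi) - math.log(s)
        - 0.5 * ((xk - mk) / s) ** 2
        for xk, mk, s, lw in zip(x, mu, sigma, log_w)
    ]

    # Log-sum-exp over the components gives log P_j.
    t = max(terms)
    return t + math.log(sum(math.exp(v - t) for v in terms))


# Two identical standard-normal components with equal weights,
# both evaluated at their mean.
value = ugmm_neuron_log_density(
    x=[0.0, 0.0], mu=[0.0, 0.0], sigma=[1.0, 1.0], logits=[0.0, 0.0]
)
```

In this example the mixture collapses to a single standard normal evaluated at its mean, so `value` equals $-\tfrac{1}{2}\log 2\pi$. Working in log space throughout keeps the output finite even when an input lies far from every component mean.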