Probabilistic Neural Networks (PNNs) with t-Distributed Outputs: Adaptive Prediction Intervals Beyond Gaussian Assumptions

Farhad Pourkamali-Anaraki

TL;DR

This work tackles uncertainty quantification in regression when the output distribution is non-Gaussian. It introduces TDistNN, a probabilistic neural network that outputs the three parameters of a Student's $t$-distribution ($f_\mu$, $f_\sigma$, $f_\nu$) to capture location, scale, and tail heaviness, and derives a tailored negative log-likelihood loss with analytical gradients for backpropagation. Because the degrees of freedom $f_\nu$ let the tail behavior adapt to the data, the approach yields prediction intervals that are narrower than those of Gaussian-based methods while preserving target coverage, as demonstrated on synthetic data and real-world benchmarks (Concrete Compressive Strength, Energy Efficiency, and Student Performance). The results indicate improved uncertainty quantification in regression tasks with heavy-tailed and outlier-prone data, offering a robust alternative to Gaussian assumptions for practical decision-making.

Abstract

Traditional neural network regression models provide only point estimates, failing to capture predictive uncertainty. Probabilistic neural networks (PNNs) address this limitation by producing output distributions, enabling the construction of prediction intervals. However, the common assumption of Gaussian output distributions often results in overly wide intervals, particularly in the presence of outliers or deviations from normality. To enhance the adaptability of PNNs, we propose t-Distributed Neural Networks (TDistNNs), which generate t-distributed outputs, parameterized by location, scale, and degrees of freedom. The degrees of freedom parameter allows TDistNNs to model heavy-tailed predictive distributions, improving robustness to non-Gaussian data and enabling more adaptive uncertainty quantification. We develop a novel loss function tailored for the t-distribution and derive efficient gradient computations for seamless integration into deep learning frameworks. Empirical evaluations on synthetic and real-world data demonstrate that TDistNNs improve the balance between coverage and interval width. Notably, for identical architectures, TDistNNs consistently produce narrower prediction intervals than Gaussian-based PNNs while maintaining proper coverage. This work contributes a flexible framework for uncertainty estimation in neural networks tasked with regression, particularly suited to settings involving complex output distributions.
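The paper's own loss and gradient derivations are not reproduced in this summary. As a minimal sketch, assuming the standard location-scale Student's t density, the per-sample negative log-likelihood for a predicted location, scale, and degrees of freedom could be written as follows. The function name `t_nll` and the argument names `mu`, `sigma`, and `nu` are illustrative choices, not identifiers from the paper.

```python
import math


def t_nll(y, mu, sigma, nu):
    """Negative log-likelihood of y under a location-scale Student's t.

    Assumes sigma > 0 and nu > 0 (hypothetical argument names).
    Standard textbook form, not necessarily the paper's exact loss.
    """
    z = (y - mu) / sigma
    return (
        -math.lgamma((nu + 1.0) / 2.0)               # log of the Gamma ratio numerator
        + math.lgamma(nu / 2.0)                      # log of the Gamma ratio denominator
        + 0.5 * math.log(nu * math.pi)               # normalization constant
        + math.log(sigma)                            # scale term
        + 0.5 * (nu + 1.0) * math.log1p(z * z / nu)  # heavy-tailed residual penalty
    )


def gaussian_nll_reference(y, mu, sigma):
    """Gaussian NLL with standard deviation sigma, used here only for comparison."""
    z = (y - mu) / sigma
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z


# A large residual is penalized far less under a heavy-tailed t (small nu)
# than under a Gaussian, which is the intuition behind the robustness claim.
outlier_t = t_nll(10.0, 0.0, 1.0, 3.0)
outlier_gauss = gaussian_nll_reference(10.0, 0.0, 1.0)
```

The residual enters the t loss through a logarithm rather than a square, so a single outlier pulls the fitted scale much less than it does under a Gaussian loss. As `nu` grows, the t loss approaches the Gaussian one.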

Paper Structure

This paper contains 8 sections, 23 equations, 9 figures, 1 table, 1 algorithm.

Figures (9)

  • Figure 1: This plot compares the tail behavior of standard Gaussian and t-distributions, showing realizations of a target variable (x-axis) and shaded 0.10 tail probability regions (0.05 per tail). It highlights the t-distribution's heavier tails and its enhanced ability to accommodate extreme values and model mismatch.
  • Figure 2: We compare the architectures and methods for generating 90% confidence level prediction intervals using: (a) Quantile neural networks (QuantileNN), and (b) Gaussian-based neural networks (GaussianNN). QuantileNN employs a single output neuron to estimate the quantile at level $\tau$, optimized with the pinball loss. GaussianNN, in contrast, directly predicts the mean and variance through two output neurons, trained with the Gaussian negative log-likelihood (GaussianNLL) loss. The lower and upper bounds of the resulting prediction intervals are shown.
  • Figure 3: This visualization shows the output layer modifications implemented to define t-Distributed Neural Networks (TDistNNs), ensuring the scale ($f_\sigma$) and degrees of freedom ($f_\nu$) parameters are appropriately constrained, and $f_\mu$ represents the point prediction.
  • Figure 4: Using a synthetic data set with outliers, we compare the performance of our proposed TDistNN against QuantileNN and GaussianNN. (a) shows the constructed prediction intervals, and (b) displays the histogram plot of degrees of freedom $f_{\nu}$ of the fitted t-distribution.
  • Figure 5: Boxplots illustrate (a) coverage score and (b) average interval width for different architectures across 20 trials on the synthetic data set with outliers. TDistNNs achieve the target coverage level of 90% with noticeably narrower prediction intervals than GaussianNNs, which is attributed to their modeling of heavier-tailed distributions.
  • ...and 4 more figures
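
The captions above name several quantities without defining them: the pinball loss and Gaussian negative log-likelihood (Figure 2), and the coverage score and average interval width (Figure 5). The sketch below gives their usual textbook definitions in plain Python. The function names are hypothetical, and the paper's own implementation may differ in details such as batching or averaging.

```python
import math


def pinball_loss(y, q_pred, tau):
    """Pinball (quantile) loss for one sample at quantile level tau in (0, 1)."""
    diff = y - q_pred
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return max(tau * diff, (tau - 1.0) * diff)


def gaussian_nll(y, mu, var):
    """Gaussian negative log-likelihood for one sample, given mean and variance."""
    return 0.5 * (math.log(2.0 * math.pi * var) + (y - mu) ** 2 / var)


def coverage_and_width(y_true, lower, upper):
    """Fraction of targets inside [lower, upper] and the mean interval width."""
    n = len(y_true)
    covered = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    avg_width = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    return covered / n, avg_width


# Toy example: three of four targets fall inside their intervals.
cov, width = coverage_and_width(
    [1.0, 2.0, 3.0, 10.0],
    [0.0, 1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0, 5.0],
)
```

For a 90% interval from a quantile network, one would typically train two outputs at `tau = 0.05` and `tau = 0.95`. Coverage and width are then judged together, since a very wide interval trivially reaches the target coverage.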