
Measuring Neural Network Complexity via Effective Degrees of Freedom

Jia Zhou, Douglas Landsittel

Abstract

Quantifying the complexity of feed-forward neural networks (FFNNs) remains challenging due to their nonlinear, hierarchical structure and numerous parameters. We apply generalized degrees of freedom (GDF) to measure model complexity in FFNNs with binary outcomes, adapting the algorithm for discrete responses. We compare GDF with both the effective number of parameters derived via log-likelihood cross-validation and the null degrees of freedom of Landsittel et al. Through simulation studies and a real data analysis, we demonstrate that GDF provides a robust assessment of model complexity for neural network models, as it depends only on the sensitivity of fitted values to perturbations in the observed responses rather than on assumptions about the likelihood. In contrast, cross-validation-based estimates of model complexity and the null degrees of freedom rely on the correctness of the assumed likelihood and may exhibit substantial variability. We find that GDF, cross-validation-based measures, and null degrees of freedom yield similar assessments of model complexity only when the fitted model adequately represents the data-generating mechanism. These findings highlight GDF as a stable and broadly applicable measure of model complexity for neural networks in statistical modeling.

Paper Structure

This paper contains 12 sections, 18 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Comparison of vertical and horizontal GDF estimators ($\widehat{\mathrm{GDF}}$) and the cross-validation-based effective number of parameters ($\hat{p}_{CV}$) under three simulation scenarios (Scenarios 1-3). Horizontal $\widehat{\mathrm{GDF}}$ is evaluated for varying numbers of response inversions $k$, and $\hat{p}_{CV}$ is evaluated for varying numbers of folds $K$.
  • Figure 2: Generalized degrees of freedom (GDF) and effective number of parameters ($\hat{p}_{CV}$) under the true and intercept-only models for a neural network with three continuous inputs. The red dotted curve shows the estimated GDF under the true model ($\widehat{\mathrm{GDF}}^{true}$), the green triangle-marked curve shows the estimated GDF under the intercept-only model ($\widehat{\mathrm{GDF}}^{int}$), the cyan square-marked curve shows $\hat{p}_{CV}$ under the true model ($\hat{p}_{CV}^{true}$), and the purple cross-marked curve shows $\hat{p}_{CV}$ under the intercept-only model ($\hat{p}_{CV}^{int}$). Panels are arranged by sample size ($n=200,~500,~1000$ from top to bottom) and number of hidden units ($H=2,~5,~10$ from left to right). Within each panel, results are shown for decay parameters $\lambda=0.01,~0.05,~0.1$.
  • Figure 3: Same as Figure 2, but for a neural network with ten continuous inputs.