Exponential Family Discriminant Analysis: Generalizing LDA-Style Generative Classification to Non-Gaussian Models

Anish Lakkapragada

Abstract

We introduce Exponential Family Discriminant Analysis (EFDA), a unified generative framework that extends classical Linear Discriminant Analysis (LDA) beyond the Gaussian setting to any member of the exponential family. Under the assumption that each class-conditional density belongs to a common exponential family, EFDA derives closed-form maximum-likelihood estimators for all natural parameters and yields a decision rule that is linear in the sufficient statistic, recovering LDA as a special case and capturing nonlinear decision boundaries in the original feature space. We prove that EFDA is asymptotically calibrated and statistically efficient under correct specification, and we generalise it to $K \geq 2$ classes and multivariate data. Through extensive simulation across five exponential-family distributions (Weibull, Gamma, Exponential, Poisson, Negative Binomial), EFDA matches the classification accuracy of LDA, QDA, and logistic regression while reducing Expected Calibration Error (ECE) by $2$--$6\times$, a gap that is structural: it persists for all $n$ and across all class-imbalance levels, because misspecified models remain asymptotically miscalibrated. We further prove and empirically confirm that EFDA's log-odds estimator approaches the Cramér-Rao bound under correct specification, and is the only estimator in our comparison whose mean squared error converges to zero. Complete derivations are provided for nine distributions. Finally, we formally verify all four theoretical propositions in Lean 4, using Aristotle (Harmonic) and OpenGauss (Math, Inc.) as proof generators, with all outputs independently machine-checked by AXLE (Axiom).
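As a concrete illustration of the closed-form estimators and the linear-in-sufficient-statistic rule described in the abstract, the following is a minimal sketch for two classes with Poisson class-conditionals. It is not the authors' implementation: the function names (`fit_efda_poisson`, `log_odds`, `predict_proba`), the simulated rates, and the class prior are all hypothetical choices made for this example. For the Poisson family the sufficient statistic is $x$ itself and the natural parameter is $\log\lambda$, so the log-odds is a straight line in $x$.

```python
import numpy as np

def fit_efda_poisson(x, y):
    """Closed-form MLE for a two-class Poisson model (illustrative only).

    Returns class priors (empirical class frequencies) and per-class
    Poisson rates (class-wise sample means).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=int)
    priors = np.array([np.mean(y == k) for k in (0, 1)])
    rates = np.array([x[y == k].mean() for k in (0, 1)])
    return priors, rates

def log_odds(x, priors, rates):
    """Log-odds of class 1 vs class 0; linear in the sufficient statistic x."""
    x = np.asarray(x, dtype=float)
    # Slope: difference of natural parameters, log(lambda_1) - log(lambda_0).
    slope = np.log(rates[1]) - np.log(rates[0])
    # Intercept: log prior ratio minus the difference of log-partition terms.
    intercept = np.log(priors[1]) - np.log(priors[0]) - (rates[1] - rates[0])
    return intercept + slope * x

def predict_proba(x, priors, rates):
    """Posterior probability of class 1 via the logistic transform."""
    return 1.0 / (1.0 + np.exp(-log_odds(x, priors, rates)))

# Simulated data with hypothetical rates 2.0 (class 0) and 6.0 (class 1).
rng = np.random.default_rng(0)
n = 2000
y = rng.binomial(1, 0.4, size=n)
x = rng.poisson(np.where(y == 1, 6.0, 2.0))

priors, rates = fit_efda_poisson(x, y)
probs = predict_proba(x, priors, rates)
```

The $\log x!$ term of the Poisson density is shared by both classes and cancels in the ratio, which is why it never appears in `log_odds`.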

Paper Structure (45 sections, 4 theorems, 30 equations, 5 figures, 5 tables)

This paper contains 45 sections, 4 theorems, 30 equations, 5 figures, 5 tables.

Key Result

Proposition 1

Under Assumption (A), the EFDA estimators satisfy $\hat{\alpha}_k \to \alpha_k^*$ and $\hat{\eta}_k \to \eta_k^*$ almost surely as $n\to\infty$.
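To make the role of these estimators concrete, the two-class plug-in log-odds can be written out. This is a sketch under assumed notation: the base measure $h$, sufficient statistic $T$, and log-partition function $A$ are the standard exponential-family symbols and are not defined in this excerpt; the class priors $\alpha_k$ and natural parameters $\eta_k$ follow Proposition 1, and $\hat{\ell}$ is the log-odds estimator referred to in Figure 4.

```latex
p(x \mid y = k) = h(x)\,\exp\!\big(\eta_k^\top T(x) - A(\eta_k)\big),
\qquad k \in \{0, 1\}

\hat{\ell}(x) = \log\frac{\hat{\alpha}_1}{\hat{\alpha}_0}
  + (\hat{\eta}_1 - \hat{\eta}_0)^\top T(x)
  - \big(A(\hat{\eta}_1) - A(\hat{\eta}_0)\big)
```

Because $h(x)$ cancels in the likelihood ratio, $\hat{\ell}$ is linear in $T(x)$. Assuming $\alpha_k^* > 0$ and that each $\eta_k^*$ lies in the interior of the natural parameter space, where $A$ is continuous, the almost-sure convergence in Proposition 1 carries over to $\hat{\ell}(x)$ at each fixed $x$ by the continuous mapping theorem.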

Figures (5)

  • Figure 1: ECE (%) by distribution and method ($n=1{,}000$, $M=100$ trials). EFDA achieves the lowest ECE in every distribution; QDA is dramatically miscalibrated on heavy-tailed data (Exponential, Gamma).
  • Figure 2: ECE vs. training size $n$ (Weibull, $M=100$ trials). The ECE gap between EFDA and LDA/LR is constant across $n$, indicating structural miscalibration of misspecified models.
  • Figure 3: ECE vs. class prior $\alpha$ (Weibull, $n=1{,}000$, $M=100$ trials). EFDA remains well-calibrated ($\leq 2\%$) across all imbalance levels; LDA and LR are $2$--$3\times$ worse.
  • Figure 4: Left: mean variance of $\hat{\ell}(x_0)$ (log $x$-axis, linear $y$). All methods' variances decay to zero; EFDA tracks the CR bound. Right: mean MSE (log-log), revealing the misspecification residual of Proposition 4. LDA, LR, and QDA plateau as their variance vanishes but their squared bias remains; only EFDA's MSE continues toward zero. ($M=1{,}000$ trials, Weibull shape $k'=3$, $\lambda_0=4$, $\lambda_1=2$; $x_0$ grid of 100 points sampled from both class-conditional distributions; $N_0=\lfloor n(1-\alpha)\rfloor$, $N_1=\lfloor n\alpha\rfloor$ fixed per trial.)
  • Figure 5: Accuracy vs. $n$ for EFDA (known and estimated $k$), LDA, and LR on Weibull data with true $k=3$ ($M=100$ trials). Estimating $k$ incurs $<1\%$ accuracy loss and stabilises rapidly by $n\approx250$.
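
Several of the figures above report Expected Calibration Error. The excerpt does not state which ECE variant or bin count the paper uses, so the following is only a sketch of one common binary-classification form: equal-width bins over the predicted probability of class 1, with the gap between the mean prediction and the observed positive rate weighted by bin frequency. The function name and the default of 10 bins are assumptions made for illustration.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Equal-width-bin ECE for binary predictions (one common variant).

    probs  : predicted probabilities of class 1, in [0, 1]
    labels : true labels in {0, 1}
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Using only the interior edges maps each probability to a bin id
    # in 0..n_bins-1, so a probability of exactly 1.0 lands in the last bin.
    bin_ids = np.digitize(probs, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Weight by the fraction of samples in the bin.
            gap = abs(labels[mask].mean() - probs[mask].mean())
            ece += mask.mean() * gap
    return ece

# Perfectly confident and correct predictions have zero gap in every bin.
ece_perfect = expected_calibration_error([0.0, 0.0, 1.0, 1.0], [0, 0, 1, 1])

# Predicting 0.9 everywhere when only half the labels are positive.
ece_overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 0, 1, 0])
```

Values returned by this function are fractions; multiply by 100 to compare with percentages such as those quoted in the captions.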

Theorems & Definitions (9)

  • Remark 1
  • Proposition 1: Consistency of the EFDA MLE
  • proof
  • Proposition 2: Calibration
  • proof
  • Proposition 3: MLE efficiency
  • proof
  • Proposition 4: Asymptotic MSE under misspecification
  • proof