Extended Fiducial Inference: Toward an Automated Process of Statistical Inference
Faming Liang, Sehwan Kim, Yan Sun
TL;DR
EFI reframes statistical inference by treating unknown parameters as fixed while propagating the uncertainty in the data to them through a learned inverse mapping $G({\boldsymbol Y},{\boldsymbol X},{\boldsymbol Z})$, estimated with a sparse DNN, and by imputing the latent errors ${\boldsymbol Z}_n$ with adaptive stochastic gradient MCMC. The framework delivers a conditional fiducial distribution that automates hypothesis testing and parameter estimation without priors, and it extends naturally to semi-supervised learning (SSL) and complex hypotheses. The EFI-DNN algorithm provides a scalable, end-to-end method with convergence guarantees for both the learned inverse and the latent imputation, yielding higher fidelity in parameter estimation, especially in the presence of outliers. Across linear and nonlinear models, Behrens–Fisher settings, multivariate normal settings, SSL tasks, and mediation tests, EFI demonstrates competitive or superior uncertainty quantification, narrower confidence intervals, and automated inference without relying on theoretical reference distributions. Overall, EFI offers a flexible, data-driven pathway toward automated statistical inference with broad applicability in modern data science.
Abstract
Although fiducial inference has been widely regarded as R. A. Fisher's big blunder, the goal he initially set, "inferring the uncertainty of model parameters on the basis of observations", has been continually pursued by many statisticians. To this end, we develop a new statistical inference method called extended fiducial inference (EFI). The new method achieves the goal of fiducial inference by leveraging advanced statistical computing techniques while remaining scalable for big data. EFI involves jointly imputing the random errors realized in observations using stochastic gradient Markov chain Monte Carlo and estimating the inverse function using a sparse deep neural network (DNN). The consistency of the sparse DNN estimator ensures that the uncertainty embedded in observations is properly propagated to model parameters through the estimated inverse function, thereby validating downstream statistical inference. Compared to frequentist and Bayesian methods, EFI offers significant advantages in parameter estimation and hypothesis testing. Specifically, EFI provides higher fidelity in parameter estimation, especially when outliers are present in the observations, and it eliminates the need for theoretical reference distributions in hypothesis testing, thereby automating the statistical inference process. EFI also provides an innovative framework for semi-supervised learning.
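To make the idea of propagating error uncertainty through an inverse function concrete, here is a minimal Python sketch of the classical fiducial argument for a normal mean with known standard deviation. It is not the EFI-DNN algorithm: the inverse function is available in closed form here, so no sparse DNN or stochastic gradient MCMC is needed. The function names `fiducial_draws_normal_mean` and `fiducial_interval`, the known-`sigma` assumption, and the toy data are all illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np


def fiducial_draws_normal_mean(y, sigma=1.0, n_draws=10_000, seed=0):
    """Draw from the fiducial distribution of theta in y_i = theta + sigma * z_i.

    Assumes z_i ~ N(0, 1) i.i.d. and that sigma is known (illustrative setup).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = y.size
    # Averaging the data-generating equation gives ybar = theta + sigma * zbar,
    # where zbar ~ N(0, 1/n). Simulate fresh copies of zbar.
    zbar = rng.standard_normal(n_draws) / np.sqrt(n)
    # Closed-form inverse function: solve the averaged equation for theta.
    return y.mean() - sigma * zbar


def fiducial_interval(draws, level=0.95):
    """Equal-tailed interval from fiducial draws."""
    alpha = 1.0 - level
    lo, hi = np.quantile(draws, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lo), float(hi)


if __name__ == "__main__":
    # Toy data, generated only for illustration.
    data_rng = np.random.default_rng(123)
    y = 2.0 + data_rng.standard_normal(25)
    draws = fiducial_draws_normal_mean(y, sigma=1.0)
    print("fiducial mean:", draws.mean())
    print("95% fiducial interval:", fiducial_interval(draws))
```

In this toy case the draws follow a normal distribution centered at the sample mean with standard deviation `sigma / sqrt(n)`, so the interval coincides with the usual z-interval. As the abstract describes, EFI targets settings where the inverse function is not available in closed form and must instead be estimated jointly with the imputed errors.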
