Eigenvalue Corrected Noisy Natural Gradient
Juhan Bae, Guodong Zhang, Roger Grosse
TL;DR
The paper addresses uncertainty estimation in neural networks by extending Noisy K-FAC to a more expressive variational posterior. It introduces the eigenvalue corrected matrix-variate Gaussian (EMVG) posterior, fitted by Noisy EK-FAC, which computes a full diagonal variance in the Kronecker-factored eigenbasis rather than restricting it to the Kronecker product of the two factors' eigenvalues. This makes the posterior more flexible while keeping it tractable. The method combines EK-FAC's diagonal correction with the noisy natural gradient framework, and the authors report improved ELBO and predictive performance on regression and classification benchmarks. The result is a scalable way to obtain more accurate uncertainty estimates in Bayesian neural networks.
Abstract
Variational Bayesian neural networks combine the flexibility of deep learning with Bayesian uncertainty estimation. However, inference procedures for flexible variational posteriors are computationally expensive. A recently proposed method, noisy natural gradient, offers a surprisingly simple way to fit expressive posteriors by adding weight noise to regular natural gradient updates. Noisy K-FAC is an instance of noisy natural gradient that fits a matrix-variate Gaussian posterior with minor changes to ordinary K-FAC. Nevertheless, a matrix-variate Gaussian posterior does not accurately capture the diagonal variance. In this work, we extend noisy K-FAC to obtain a more flexible posterior distribution called the eigenvalue corrected matrix-variate Gaussian. The proposed method computes the full diagonal re-scaling factor in the Kronecker-factored eigenbasis. Empirically, our approach consistently outperforms existing algorithms (e.g., noisy K-FAC) on regression and classification tasks.
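
To make the difference between the two posteriors concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single layer with weight matrix of shape (out, in), orthogonal eigenbases `q_in` and `q_out` for the two Kronecker factors, and a non-negative matrix `var_eig` holding one variance per coordinate in the Kronecker-factored eigenbasis. All function and variable names are hypothetical, and the way the variances would be estimated during training is not shown. Under these assumptions, a matrix-variate Gaussian corresponds to the special case where `var_eig` is a rank-one outer product of the two factors' eigenvalues, while the eigenvalue corrected form allows every entry to be set independently.

```python
import numpy as np


def random_orthogonal(n, rng):
    # QR decomposition of a Gaussian matrix yields an orthogonal basis.
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


def sample_emvg(mean, q_in, q_out, var_eig, rng):
    """Draw one weight sample (illustrative parameterization, an assumption).

    mean:    (out, in) posterior mean
    q_in:    (in, in) orthogonal eigenbasis of the input-side factor
    q_out:   (out, out) orthogonal eigenbasis of the output-side factor
    var_eig: (out, in) non-negative variances in the Kronecker-factored eigenbasis
    """
    noise = rng.standard_normal(mean.shape)
    # Scale the noise elementwise in the eigenbasis, then rotate back.
    return mean + q_out @ (np.sqrt(var_eig) * noise) @ q_in.T


def dense_covariance(q_in, q_out, var_eig):
    # Covariance of the column-major vec(W):
    # (Q_in kron Q_out) diag(vec(R)) (Q_in kron Q_out)^T.
    # Dense and only practical for tiny layers; used here for checking.
    basis = np.kron(q_in, q_out)
    return basis @ np.diag(var_eig.flatten(order="F")) @ basis.T


rng = np.random.default_rng(0)
n_in, n_out = 3, 2
q_in = random_orthogonal(n_in, rng)
q_out = random_orthogonal(n_out, rng)

# Matrix-variate Gaussian special case: variances factor as an outer product.
eig_in = rng.uniform(0.5, 2.0, n_in)
eig_out = rng.uniform(0.5, 2.0, n_out)
var_mvg = np.outer(eig_out, eig_in)

# Eigenvalue corrected case: every entry is free (random values for illustration).
var_emvg = rng.uniform(0.5, 2.0, (n_out, n_in))

weights = sample_emvg(np.zeros((n_out, n_in)), q_in, q_out, var_emvg, rng)
```

In the rank-one case, `dense_covariance(q_in, q_out, var_mvg)` equals the Kronecker product of the two factor covariances, which is the matrix-variate Gaussian structure; with a general `var_emvg` the covariance is still diagonal in the same eigenbasis but no longer factorizes.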
