Bayesian Neural Networks for Functional ANOVA model

Seokhun Park, Choeun Kim, Jihu Lee, Yunseop Shin, Insung Kong, Yongdai Kim

TL;DR

This work targets interpretability in high-dimensional function estimation by extending functional ANOVA with a Bayesian tensor product neural network (Bayesian-TPNN). It treats both the network architecture (the number of nodes $K$ and the active variable sets $S_k$) and the network weights as random, enabling joint inference over higher-order interactions via a carefully designed MCMC procedure that grows or prunes architectures while reusing existing components. The authors prove posterior consistency for each ANOVA component and demonstrate competitive predictive performance alongside improved uncertainty quantification across real and synthetic benchmarks, with compelling interpretability through component plots and explanations in concept bottleneck models. The framework scales better to higher orders than existing ANOVA-TPNN approaches and offers practical benefits in CBMs, providing a principled path to capturing complex interactions in interpretable machine learning models.

Abstract

With the increasing demand for interpretability in machine learning, functional ANOVA decomposition has gained renewed attention as a principled tool for breaking down a high-dimensional function into low-dimensional components that reveal the contributions of different variable groups. Recently, the Tensor Product Neural Network (TPNN) has been developed and used to supply basis functions for the functional ANOVA model, an approach referred to as ANOVA-TPNN. A disadvantage of ANOVA-TPNN, however, is that the components to be estimated must be specified in advance, which makes it difficult to incorporate higher-order TPNNs into the functional ANOVA model because of computational and memory constraints. In this work, we propose Bayesian-TPNN, a Bayesian inference procedure for the functional ANOVA model with TPNN basis functions, which enables the detection of higher-order components at a lower computational cost than ANOVA-TPNN. We develop an efficient MCMC algorithm and demonstrate that Bayesian-TPNN performs well by analyzing multiple benchmark datasets. Theoretically, we prove that the posterior of Bayesian-TPNN is consistent.

Paper Structure

This paper contains 106 sections, 5 theorems, 150 equations, 6 figures, 22 tables, 1 algorithm.

Key Result

Theorem 1

Any real-valued function $f$ defined on $\mathbb{R}^{p}$ can be uniquely decomposed as $f(\mathbf{x}) = \sum_{S \subseteq \{1,\dots,p\}} f_{S}(\mathbf{x}_{S})$ almost everywhere with respect to $\prod_{j=1}^{p}\mu_{j}$, where each component $f_{S}$ satisfies the sum-to-zero condition with respect to $\mu$.
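
To make the statement of Theorem 1 concrete, the following is a minimal numerical sketch of the standard functional ANOVA decomposition, in which $f$ is written as a sum of components $f_{S}$ over subsets $S$ of the input variables. It is not the paper's Bayesian-TPNN procedure. It assumes that $f$ is tabulated on a full grid and that each $\mu_{j}$ is the uniform measure on the grid of $x_{j}$; the function name `anova_decompose` and the test function are hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def anova_decompose(values):
    """Functional ANOVA components of a function tabulated on a full grid.

    values has shape (n_1, ..., n_p); axis j indexes the grid of x_j.
    Assumption of this sketch: each mu_j is uniform on its grid, so
    integrating out a variable is a plain mean over its axis.
    Returns a dict {S: f_S}, where S is a tuple of axis indices and f_S
    is an array that broadcasts against values.
    """
    p = values.ndim
    components = {}
    for order in range(p + 1):
        for S in itertools.combinations(range(p), order):
            rest = tuple(j for j in range(p) if j not in S)
            # Integrate out every variable that is not in S.
            f_S = values.mean(axis=rest, keepdims=True) if rest else values.copy()
            # Remove the components of all proper subsets of S.
            for r in range(order):
                for T in itertools.combinations(S, r):
                    f_S = f_S - components[T]
            components[S] = f_S
    return components


# Illustrative function: f(x) = x1 + x2^2 + x1 * x3 on a small grid.
x1 = np.linspace(-1.0, 1.0, 5)
x2 = np.linspace(0.0, 1.0, 4)
x3 = np.linspace(0.0, 2.0, 3)
g1, g2, g3 = np.meshgrid(x1, x2, x3, indexing="ij")
f = g1 + g2**2 + g1 * g3

comps = anova_decompose(f)

# The components add back up to f (broadcasting restores the full shape).
recon = sum(comps.values())
print("max reconstruction error:", np.abs(recon - f).max())

# Sum-to-zero: each f_S averages to zero over any one of its own variables.
for S, f_S in comps.items():
    for j in S:
        assert np.allclose(f_S.mean(axis=j), 0.0)
```

In this example the component for $(x_{2}, x_{3})$ vanishes because $f$ contains no interaction between those two variables, while the component for $(x_{1}, x_{3})$ does not. The number of components grows as $2^{p}$, which is why enumerating all of them in advance becomes impractical for higher orders.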

Figures (6)

  • Figure 1: Bayesian-TPNN with $p=4$ and $K=5$.
  • Figure 2: Plots of the functional relations of the important main effects estimated by Bayesian-TPNN on the Boston dataset. Each plot shows the Bayes estimate and 95% credible interval of each component. Labels indicate the names of the input variables along with the normalized importance scores.
  • Figure 3: Plots of the number of basis functions $K$ and the RMSEs for various values of $C_{0}$.
  • Figure 4: Scatter plots of the true expectations against the estimated ones.
  • Figure 5: Examples of images misclassified by the linear model.
  • ...and 1 more figure

Theorems & Definitions (6)

  • Theorem 1: Functional ANOVA Decomposition (hooker2007generalized; mcbook)
  • Remark 2
  • Theorem 3: Posterior Consistency of Bayesian-TPNN
  • Theorem 4: Posterior Consistency of Bayesian-TPNN
  • Lemma 1
  • Lemma 2: Theorem 19.3 of gyorfi2006distribution