Dynamic Online Ensembles of Basis Expansions

Daniel Waxman; Petar M. Djurić


TL;DR

The paper advances online Bayesian learning by generalizing online GP ensembling to arbitrary linear basis expansions (OEBE), enabling ensembles across diverse models from GAM-HSGPs to RBF networks. It introduces dynamic extensions (DOEBE, SDOEBE) with random walks on parameters and weights, and proposes E-DOEBE to mix dynamic and static models while avoiding weight-collapse. Theoretical regret analyses adapt existing IE-GP bounds to the OEBE framework, and empirical results across multiple real and synthetic datasets show that no single basis dominates; ensembles of diverse bases, including additive Hilbert-space GP constructions, often outperform standard RFF-based approaches. The work highlights practical strategies for hyperparameter sampling, non-Gaussian likelihood inference via Laplace approximation, and the value of additive models in high-dimensional settings, with significant implications for real-time, adaptable Bayesian modeling and online decision-making.

Abstract

Practical Bayesian learning often requires (1) online inference, (2) dynamic models, and (3) ensembling over multiple different models. Recent advances have shown how to use random feature approximations to achieve scalable, online ensembling of Gaussian processes with desirable theoretical properties and fruitful applications. One key to these methods' success is the inclusion of a random walk on the model parameters, which makes models dynamic. We show that these methods can be generalized easily to any basis expansion model and that using alternative basis expansions, such as Hilbert space Gaussian processes, often results in better performance. To simplify the process of choosing a specific basis expansion, our method's generality also allows the ensembling of several entirely different models, for example, a Gaussian process and polynomial regression. Finally, we propose a novel method to ensemble static and dynamic models together.
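The three ingredients named in the abstract (online inference, a random walk on the parameters, and ensembling over different bases) can be illustrated with a minimal sketch. It is not the authors' implementation. The names `OnlineBasisModel`, `rbf_features`, `poly_features`, and `run_ensemble`, the Gaussian likelihood, the unit-covariance prior, and all numeric settings are assumptions chosen for illustration. Each model is a Bayesian linear regression on its own basis, updated recursively; a nonzero `walk_var` inflates the parameter covariance before each step, which is one way to realize a random walk on the parameters. Ensemble weights are updated from each model's one-step-ahead predictive density, in the spirit of Bayesian model averaging. The random walk on weights (SDOEBE) and the E-DOEBE construction are not sketched here.

```python
import numpy as np

def rbf_features(x, centers, lengthscale=1.0):
    # Hypothetical RBF-network basis: one Gaussian bump per center.
    return np.exp(-0.5 * ((x - centers) / lengthscale) ** 2)

def poly_features(x, degree=3):
    # Hypothetical polynomial basis: [1, x, x^2, ..., x^degree].
    return x ** np.arange(degree + 1)

class OnlineBasisModel:
    """Bayesian linear model y = phi(x)^T theta + noise, updated per sample."""

    def __init__(self, phi, dim, noise_var=0.1, walk_var=0.0):
        self.phi = phi
        self.mean = np.zeros(dim)   # posterior mean of theta
        self.cov = np.eye(dim)      # posterior covariance (assumed unit prior)
        self.noise_var = noise_var
        self.walk_var = walk_var    # 0 gives a static model, > 0 a dynamic one

    def _moments(self, x):
        # The random-walk step inflates the covariance before predicting.
        cov = self.cov + self.walk_var * np.eye(len(self.mean))
        f = self.phi(x)
        return f @ self.mean, f @ cov @ f + self.noise_var, f, cov

    def predict(self, x):
        mean, var, _, _ = self._moments(x)
        return mean, var

    def update(self, x, y):
        # Rank-one (Kalman-style) update; returns the log predictive density
        # of y evaluated before the update.
        m, s2, f, cov = self._moments(x)
        gain = cov @ f / s2
        self.mean = self.mean + gain * (y - m)
        self.cov = cov - np.outer(gain, f @ cov)
        return -0.5 * (np.log(2 * np.pi * s2) + (y - m) ** 2 / s2)

def run_ensemble(models, xs, ys):
    # Uniform initial weights, kept in log space for numerical stability.
    log_w = np.full(len(models), -np.log(len(models)))
    preds = []
    for x, y in zip(xs, ys):
        w = np.exp(log_w)
        preds.append(w @ np.array([m.predict(x)[0] for m in models]))
        # Reweight each model by its predictive density, then renormalize.
        log_w = log_w + np.array([m.update(x, y) for m in models])
        log_w -= log_w.max()
        log_w -= np.log(np.exp(log_w).sum())
    return np.array(preds), np.exp(log_w)

centers = np.linspace(-3.0, 3.0, 10)
models = [
    OnlineBasisModel(lambda x: rbf_features(x, centers), dim=10),
    OnlineBasisModel(lambda x: rbf_features(x, centers), dim=10, walk_var=1e-3),
    OnlineBasisModel(poly_features, dim=4),
]
rng = np.random.default_rng(0)
xs = rng.uniform(-3.0, 3.0, size=300)
ys = np.sin(xs) + 0.1 * rng.normal(size=300)
preds, weights = run_ensemble(models, xs, ys)
```

Because every model exposes the same `predict` and `update` interface, any basis expansion with a linear-Gaussian form can be dropped into the list, which is the sense in which the ensemble can mix entirely different models.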


Paper Structure (46 sections, 1 theorem, 44 equations, 14 figures, 4 tables)


Introduction
Background
Linear Basis Expansions
Predictive Distribution and Online Estimation
Fitting Hyperparameters with Empirical Bayes
Examples of Basis Functions
Gaussian Process Regression
Predictive Distribution
The Choice of Mean and Kernel
Sparse Spectrum and Hilbert Space Approximations
Random Fourier Feature Gaussian Processes
Hilbert Space Gaussian Processes
Bayesian Model Averaging
Dynamic Online Ensembles of Basis Expansions
Online Ensembles of Basis Expansions
...and 31 more sections

Key Result

Theorem 1

Let the negative log-likelihood $\mathcal{L}(\cdot; y_\tau)$ be $\mathcal{C}^2$ with bounded second derivative, i.e., $\lvert \frac{d^2}{dz_\tau^2} \mathcal{L}(z_\tau; y_\tau) \rvert \leq c$ for all $z_\tau$ and some constant $c$. Let us consider an OEBE with prior where $F_m = \dim {\bm{\theta}}^{(m)}$. Furthermore, assume that for any ${\mathbf{x}}$, the norm of $\phi^{(m)}({\mathbf{x}})$ is boun

Figures (14)

Figure 1: The hierarchy of the proposed models. All models have the same distribution for $y_t$ conditioned on ${\bm{\theta}}_t$, while DOEBE adds a random walk to parameters, and SDOEBE adds a random walk to BMA weights.
Figure 2: Results of Experiment 1. Pictured are the nMSE (lower is better) and PLL (higher is better), with error bars denoting one standard deviation over 10 random trials. The best-performing method on each dataset and metric is highlighted with bold edges. To preserve readability, the nMSE axis is bounded at $+1.0$ and the PLL axis at $\pm 2.5$, even if points extend past these limits.
Figure 3: Results of Experiment 2. Pictured is the 10-sample moving average of the BMA weight corresponding to the dynamic additive HSGP when trained as a DOEBE and as an E-DOEBE.
Figure 4: Results of Experiment 3. Pictured are the nMSE (lower is better) and PLL (higher is better), with error bars denoting one standard deviation over 10 random trials. The best-performing method on each dataset and metric is highlighted with bold edges. To preserve readability, the nMSE axis is bounded at $+1.0$ and the PLL axis at $\pm 2.5$, even if points extend past these limits.
Figure 5: (Left) Classification errors for the shuffled banana dataset. Lines indicate the mean over 5 trials, with shaded regions denoting $\pm 1$ standard deviation. (Right) The predictive distribution of an OE-RFF ensemble after training on the shuffled banana dataset.
...and 9 more figures

Theorems & Definitions (3)

Theorem 1: Online Regret Bound for OEBE
proof
proof: Proof of Theorem 1
