Exact and general decoupled solutions of the LMC Multitask Gaussian Process model

Olivier Truffinet, Karim Ammar, Jean-Philippe Argaud, Bertrand Bouriquet

TL;DR

The projected LMC appears as a credible and simpler alternative to state-of-the-art models, and it greatly facilitates some computations such as leave-one-out cross-validation and fantasization.

Abstract

The Linear Model of Co-regionalization (LMC) is a very general multitask Gaussian process model for regression or classification. While its expressivity and conceptual simplicity are appealing, naive implementations have cubic complexity in the number of datapoints and the number of tasks, making approximations mandatory for most applications. However, recent work has shown that under some conditions the latent processes of the model can be decoupled, leading to a complexity that is only linear in the number of said processes. We extend these results here, showing from the most general assumptions that the only condition necessary for an efficient exact computation of the LMC is a mild hypothesis on the noise model. We introduce a full parametrization of the resulting projected LMC model, and an expression of the marginal likelihood enabling efficient optimization. We perform a parametric study on synthetic data to show the excellent performance of our approach, compared to an unrestricted exact LMC and approximations of the latter. Overall, the projected LMC appears as a credible and simpler alternative to state-of-the-art models, and it greatly facilitates some computations such as leave-one-out cross-validation and fantasization.
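
To make the cubic cost of a naive implementation concrete, here is a minimal sketch of the dense covariance of a textbook LMC, in which each task is a linear mix of independent latent Gaussian processes. It is not the projected LMC introduced in the paper. The function names, the squared-exponential kernel, the one-dimensional inputs, and the i.i.d. noise term are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale):
    """Squared-exponential kernel between two 1-D input arrays (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def naive_lmc_covariance(x, H, lengthscales, noise_var):
    """Dense covariance of the stacked outputs under a textbook LMC.

    x            : (n,) training inputs
    H            : (p, q) mixing matrix (p tasks, q latent processes)
    lengthscales : one kernel lengthscale per latent process
    noise_var    : i.i.d. noise variance (a simplification, not the
                   paper's noise model)
    Returns an (n*p, n*p) matrix in task-major ordering.
    """
    p, q = H.shape
    n = x.shape[0]
    K = np.zeros((n * p, n * p))
    for j in range(q):
        B_j = np.outer(H[:, j], H[:, j])  # rank-1 coregionalization matrix
        K += np.kron(B_j, rbf_kernel(x, x, lengthscales[j]))
    return K + noise_var * np.eye(n * p)

rng = np.random.default_rng(0)
n, p, q = 30, 4, 2
x = np.linspace(0.0, 1.0, n)
H = rng.normal(size=(p, q))
K = naive_lmc_covariance(x, H, lengthscales=[0.1, 0.3], noise_var=1e-2)

# Exact inference factorizes this (n*p) x (n*p) matrix, which costs
# O((n*p)^3) operations: the cubic scaling mentioned in the abstract.
L = np.linalg.cholesky(K)
```

The matrix has one row per (task, datapoint) pair, so adding tasks grows the factorization cost just as quickly as adding datapoints. Decoupling the latent processes avoids ever forming this matrix.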

Paper Structure (58 sections, 18 theorems, 37 equations, 7 figures, 5 tables)

Key Result

Proposition 1

The posterior $p(\mathbf{U_{v}}|\mathbf{Y})$ of the latent processes $\mathbf{U}$ at the training points is Gaussian with mean and variance:

Figures (7)

  • Figure 1: RMSE of several models for increasing data noise magnitude, with highly structured noise ($\mu_{str}=0.99$). Averaged over $N_{rep}=40$ random datasets.
  • Figure 2: RMSE of several models for increasing proportion of structured noise, with fixed noise magnitude $\mu_{noise}=0.1$. Averaged over $N_{rep}=20$ random datasets.
  • Figure 3: Average Predictive Variance Adequacy of several models for increasing number of noise latent processes $q_{noise}$, with $\mu_{noise}=0.1$, $\mu_{str}=0.9$, and $N_{rep}=10$.
  • Figure 4: Training duration of several models for increasing number of tasks and a fixed number $q=25$ of latent processes. Averaged over $N_{rep}=20$ random datasets.
  • Figure 5: Duration of a training iteration for several models, with a) increasing number of tasks and fixed number of latent processes, and b) the opposite. Averaged over $N_{rep}=20$ random datasets.
  • ...and 2 more figures

Theorems & Definitions (39)

  • Proposition 1
  • Remark 1
  • Proposition 2
  • Definition 1
  • Proposition 3
  • Definition 2
  • Proposition 4
  • Remark 2
  • Proposition 5
  • Lemma 1
  • ...and 29 more