Marginalization Consistent Probabilistic Forecasting of Irregular Time Series via Mixture of Separable flows

Vijaya Krishna Yalavarthi; Randolf Scholz; Christian Kloetergens; Kiran Madhusudhanan; Stefan Born; Lars Schmidt-Thieme

TL;DR

MOSES tackles probabilistic forecasting for irregular time series by enforcing marginalization consistency while preserving strong joint prediction. The model has four components: a separable encoder, $D$ Gaussian sources with full covariance, $D$ separable normalizing flows, and a mixture over the resulting components. Together these ensure that $\hat{p}(y\mid Q,X)$ is marginalization-consistent. Across four real-world datasets, MOSES comes close to ProFITi on joint distributions and is substantially more accurate on marginals, with a measured marginal inconsistency ($D_{KL}$-based IPC) close to zero. This makes MOSES particularly suitable for decision-making in domains such as weather and healthcare, where consistent marginals make forecasts easier to trust and apply.

Abstract

Probabilistic forecasting of joint distributions of targets in irregular time series with missing values is a heavily under-researched area in machine learning; to the best of our knowledge, only two models have been studied so far: Gaussian Process Regression and ProFITi. While ProFITi, thanks to its multivariate normalizing flows, is very expressive and achieves better predictive performance, it suffers from marginalization inconsistency: it does not guarantee that the marginal distributions of a subset of variables in its predictive distribution coincide with the directly predicted distributions of those variables. When asked to directly predict marginal distributions, it is often vastly inaccurate. We propose MOSES (Marginalization Consistent Mixture of Separable Flows), a model that parametrizes a stochastic process through a mixture of several latent multivariate Gaussian Processes combined with separable univariate normalizing flows. In particular, MOSES can be analytically marginalized, allowing it to directly answer a wider range of probabilistic queries than most competitors. Experiments on four datasets show that MOSES achieves accurate joint and marginal predictions, surpassing all other marginalization consistent baselines; it trails only slightly behind ProFITi in joint prediction while being vastly superior when predicting marginal distributions.
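The analytic marginalization claimed in the abstract can be illustrated with a toy sketch. The code below is not the authors' implementation: it assumes a two-variable mixture with two components, hand-picked Gaussian parameters, and a hypothetical univariate flow $y = \sinh(s z + t)$ chosen only because its inverse is available in closed form. It compares the closed-form marginal of $y_1$ with a numerical integration of the joint density over $y_2$.

```python
import math

def gauss2_pdf(z1, z2, mu, cov):
    """Bivariate normal density with mean mu=(m1, m2), covariance ((a, b), (b, c))."""
    (a, b), (_, c) = cov
    det = a * c - b * b
    d1, d2 = z1 - mu[0], z2 - mu[1]
    quad = (c * d1 * d1 - 2.0 * b * d1 * d2 + a * d2 * d2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def gauss1_pdf(z, mean, var):
    """Univariate normal density."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def inv_flow(y, scale, shift):
    """Inverse of the assumed flow y = sinh(scale * z + shift); returns (z, |dz/dy|)."""
    z = (math.asinh(y) - shift) / scale
    jac = 1.0 / (scale * math.sqrt(1.0 + y * y))
    return z, jac

# Illustrative parameters only (assumption): weights, Gaussian sources, and one
# (scale, shift) pair per variable and component.
COMPONENTS = [
    dict(w=0.6, mu=(-1.0, 0.5), cov=((1.0, 0.6), (0.6, 1.5)),
         flows=((0.8, 0.0), (0.5, 0.3))),
    dict(w=0.4, mu=(1.5, -0.5), cov=((0.5, -0.2), (-0.2, 0.8)),
         flows=((1.2, -0.4), (0.7, 0.0))),
]

def joint_pdf(y1, y2):
    """Mixture of Gaussians pushed through per-variable (separable) flows."""
    total = 0.0
    for c in COMPONENTS:
        z1, j1 = inv_flow(y1, *c["flows"][0])
        z2, j2 = inv_flow(y2, *c["flows"][1])
        total += c["w"] * gauss2_pdf(z1, z2, c["mu"], c["cov"]) * j1 * j2
    return total

def marginal_pdf(y1):
    """Closed-form marginal of y1: keep mu[0], cov[0][0] and the first flow only."""
    total = 0.0
    for c in COMPONENTS:
        z1, j1 = inv_flow(y1, *c["flows"][0])
        total += c["w"] * gauss1_pdf(z1, c["mu"][0], c["cov"][0][0]) * j1
    return total

def integrate_out_y2(y1, u_max=12.0, n=4001):
    """Trapezoid rule for the integral of joint_pdf over y2, substituting y2 = sinh(u)."""
    h = 2.0 * u_max / (n - 1)
    total = 0.0
    for i in range(n):
        u = -u_max + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * joint_pdf(y1, math.sinh(u)) * math.cosh(u)
    return total * h

if __name__ == "__main__":
    for y1 in (-2.0, 0.0, 0.5, 3.0):
        print(y1, marginal_pdf(y1), integrate_out_y2(y1))
```

The two columns printed for each $y_1$ should agree up to quadrature error. The agreement relies on two properties of this toy setup: each flow acts on a single variable, and the mixture weights do not change when a variable is dropped.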

Paper Structure (40 sections, 4 theorems, 26 equations, 4 figures, 11 tables)

Introduction
Preliminaries
Requirements.
Related Work
Constructing Marginalization Consistent Conditional Distributions
Separably Parametrized Gaussians.
Separable Normalizing Flows.
Conditional Mixtures of Flows.
Mixtures of Separable Flows (MOSES)
1. Separable Encoder.
2. $D$ separably parametrized Gaussian source distributions $p_{Z_d}(z \mid \mu_d, \Sigma_d)$.
3. $D$ separable normalizing flows $\hat{p}_d^{\text{flow}}$.
4. Mixture Model.
Computational Complexities.
Training.
...and 25 more sections

Key Result

Theorem 2.1

Any model that satisfies the requirements listed in the Requirements section, from joint prediction through marginalization consistency, realizes an $\mathbb{R}$-valued stochastic process over the index set $T=\mathbb{R}\times\{1,\ldots,C\}$. Proof. This is a direct application of Kolmogorov's extension theorem (Øksendal, 2003).
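The requirements themselves are not reproduced in this summary. As a hedged reminder, Kolmogorov's extension theorem asks the family of finite-dimensional distributions to satisfy two consistency conditions. In notation assumed here for illustration (query points $q_1,\ldots,q_K \in T$, targets $y_1,\ldots,y_K$, and an arbitrary permutation $\pi$ of $\{1,\ldots,K\}$), they can be written as permutation invariance,

$$\hat{p}(y_1,\ldots,y_K \mid q_1,\ldots,q_K, X) = \hat{p}(y_{\pi(1)},\ldots,y_{\pi(K)} \mid q_{\pi(1)},\ldots,q_{\pi(K)}, X),$$

and marginalization consistency,

$$\int_{\mathbb{R}} \hat{p}(y_1,\ldots,y_K \mid q_1,\ldots,q_K, X)\,\mathrm{d}y_K = \hat{p}(y_1,\ldots,y_{K-1} \mid q_1,\ldots,q_{K-1}, X).$$

If both hold for every finite set of queries, the theorem guarantees a stochastic process on $T$ whose finite-dimensional distributions are exactly these predictive densities.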

Figures (4)

Figure 1: (Top) Importance of multiple flow components: $\text{MOSES}(1)$ cannot represent the correct distribution, but $\text{MOSES}(4)$ can. (Bottom) Limitation of Gaussian Mixture Models: GMM needs 15 components to match the distribution of $\text{MOSES}(4)$.
Figure 2: Illustration of $\text{MOSES}$ with $D$ flows (fixed) and $K$ variables (variable). The encoder (enc) takes $X, Q$ (the observed series and the query timepoint-channel ids) as input and outputs an embedding $\mathbf{h}$ (which depends on both $X$ and $Q$) and $w$ (which depends on $X$ only). The parameters $\mu, \Sigma$ of $p_{Z_d}$ are parametrized by $\mathbf{h}_d$, as is the flow transformation of $p_{Z_d}$. The transformation layer consists of $K$ univariate transformations $\phi$ that transform $z_k$ of $z\sim p_{Z_d}(z\mid \mathbf{h}_d)$ to $y_k$ of $y\sim p_d^{\text{flow}}(y\mid \mathbf{h}_d)$.
Figure 3: Demonstration of marginal consistency for $\text{MOSES}$ (ours), ProFITi (Yalavarthi et al., 2024), and Gaussian Process Regression (Bonilla et al., 2007) on two toy datasets: blast (left) and circle (right). ProFITi is inconsistent w.r.t. the marginals of the second variable $y_2$, while $\text{MOSES}$ is consistent with the marginals of both $y_1$ and $y_2$. $\text{MOSES}(D)$ indicates $D$ mixture components. Gaussian Process Regression (GPR) is marginalization consistent but predicts incorrect distributions.
Figure 4: njNLL vs. $\mathrm{MI}$. $\text{MOSES}$ is marginalization consistent within sampling error.
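
Read together with the caption of Figure 2, one plausible way to write the predictive density is the change-of-variables formula below. The symbols $\phi_{d,k}$ (the univariate transformation of variable $k$ in component $d$) and $w_d(X)$ (the mixture weight of component $d$) are notational assumptions made for this sketch and are not quoted from the paper.

$$\hat{p}(y \mid Q, X) = \sum_{d=1}^{D} w_d(X)\, \mathcal{N}\!\left(z^{(d)};\, \mu_d, \Sigma_d\right) \prod_{k=1}^{K} \left| \frac{\partial \phi_{d,k}^{-1}(y_k)}{\partial y_k} \right|, \qquad z^{(d)}_k = \phi_{d,k}^{-1}(y_k).$$

Under this reading, integrating out $y_k$ removes the $k$-th entry of $\mu_d$, the $k$-th row and column of $\Sigma_d$, and the $k$-th Jacobian factor, while the weights $w_d(X)$ stay unchanged because they do not depend on $Q$. This only yields the directly predicted marginal if the remaining entries of $\mu_d$ and $\Sigma_d$ and the remaining transformations do not depend on the dropped query, which is presumably what the separable parametrization is meant to ensure.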

Theorems & Definitions (7)

Theorem 2.1
Lemma 4.1
Lemma 4.2
Theorem 5.1
proof
proof
proof
