Wasserstein multivariate auto-regressive models for modeling distributional time series

Yiye Jiang, Jérémie Bigot

TL;DR

The paper models multiple series of time-indexed probability distributions by treating each distribution as a random object in the Wasserstein space and proposing a Wasserstein multivariate autoregressive (WMAR) model. Using the theory of iterated random function (IRF) systems, it establishes existence, uniqueness, and second-order stationarity of the model's solution. It also introduces an estimator constrained to the simplex, which is naturally sparse and therefore lets one learn a graph of temporal dependency among the series. The authors provide a practical centering strategy, a quantile-function representation, and a consistent estimation procedure, validated through simulations and real-data applications to age distributions and Paris bike-sharing. The resulting framework supports interpretable cross-series dependency analysis for distributional time series, with consistency and stationarity guarantees.

Abstract

This paper is focused on the statistical analysis of data consisting of a collection of multiple series of probability measures that are indexed by distinct time instants and supported over a bounded interval of the real line. By modeling these time-dependent probability measures as random objects in the Wasserstein space, we propose a new auto-regressive model for the statistical analysis of multivariate distributional time series. Using the theory of iterated random function systems, results on the second order stationarity of the solution of such a model are provided. We also propose a consistent estimator for the auto-regressive coefficients of this model. Due to the simplex constraints that we impose on the model coefficients, the proposed estimator, which is learned under these constraints, naturally has a sparse structure. The sparsity allows the application of the proposed model in learning a graph of temporal dependency from multivariate distributional time series. We explore the numerical performance of our estimation procedure using simulated data. To shed some light on the benefits of our approach for real data analysis, we also apply this methodology to two data sets, made of observations of age distributions in different countries and of the bike-sharing network in Paris, respectively.
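
To illustrate why a simplex-type constraint tends to produce sparse coefficients, here is a minimal sketch of constrained least squares solved by projected gradient descent. It is not the authors' estimator: the constraint set (nonnegative coefficients with sum at most 1), the synthetic design matrix `X`, the noise level, and the helper names `project_capped_simplex` and `fit_row` are all assumptions made for illustration.

```python
import numpy as np

def project_capped_simplex(v):
    """Euclidean projection onto {x >= 0, sum(x) <= 1} (assumed constraint set)."""
    w = np.maximum(v, 0.0)
    if w.sum() <= 1.0:
        return w
    # Sum constraint is active: project onto {x >= 0, sum(x) = 1} by sorting.
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - (css - 1.0) / k > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def fit_row(X, y, n_iter=500):
    """Projected gradient for min_a 0.5 * ||y - X a||^2 over the capped simplex."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    a = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ a - y)
        a = project_capped_simplex(a - step * grad)
    return a

# Synthetic check: one target series depending on 2 of 4 candidate series.
rng = np.random.default_rng(0)
a_true = np.array([0.6, 0.2, 0.0, 0.0])
X = rng.standard_normal((500, 4))          # hypothetical stacked predictors
y = X @ a_true + 0.01 * rng.standard_normal(500)
a_hat = fit_row(X, y)
print(np.round(a_hat, 3))
```

Coefficients that the unconstrained fit would push below zero are clipped to exactly zero by the projection, which is the mechanism behind the sparsity. In a dependency-graph reading, the nonzero entries of each fitted row would indicate which series influence the target series.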

Paper Structure (41 sections, 19 theorems, 160 equations, 17 figures, 1 table)

Key Result

Proposition 2.1

Define the probability measure $\gamma_\alpha$, $\alpha \in [0,1]$, by […]. Then $d_W(\gamma, \gamma_{\alpha}) = \alpha \, d_W(\gamma, \mu)$, where $d_W$ is the Wasserstein distance of $\mathcal{W}_2({\rm I\!R})$.
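
The display defining $\gamma_\alpha$ did not survive in the statement above. The following is a sketch of why the identity holds under one natural reading, stated as an assumption: $\gamma$ is atomless, $T = F_\mu^{-1} \circ F_\gamma$ is the optimal transport map from $\gamma$ to $\mu$, and $\gamma_\alpha$ is the point at fraction $\alpha$ along the geodesic from $\gamma$ to $\mu$. The symbols $T$, $T_\alpha$ and the push-forward notation $\#$ are introduced here for illustration and may differ from the paper's notation.

```latex
% Assumed definition (not confirmed by the source):
\gamma_\alpha = \bigl(\mathrm{id} + \alpha\,(T - \mathrm{id})\bigr)\#\gamma ,
\qquad T = F_\mu^{-1}\circ F_\gamma .

% T_\alpha := (1-\alpha)\,\mathrm{id} + \alpha\,T is nondecreasing, hence it is
% the optimal map from \gamma to \gamma_\alpha on the real line. Therefore
d_W^2(\gamma,\gamma_\alpha)
  = \int \bigl|T_\alpha(x) - x\bigr|^2 \, d\gamma(x)
  = \alpha^2 \int \bigl|T(x) - x\bigr|^2 \, d\gamma(x)
  = \alpha^2 \, d_W^2(\gamma,\mu),

% and taking square roots gives
d_W(\gamma,\gamma_\alpha) = \alpha \, d_W(\gamma,\mu).
```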

Figures (17)

  • Figure 1: Annual records of age distributions of EU countries. At the top are the $27$ countries of the European Union. A sequence of age distributions is recorded for each country over the years. For example, at the bottom we illustrate the sequence for France, where one distribution supported over $[0,1]$ is observed each year. On the lower left, we visualize the resulting univariate distributional time series as a surface in the coordinate system Age $\times$ Year $\times$ Relative frequency. The raw data in this plot consist of $40$ annual distributions, which we complete with interpolated samples to draw the surface. On the lower right, we show the projection of the raw time series onto the Age $\times$ Relative frequency plane. We can see that the population is aging over time.
  • Figure 2: Geodesic in $\mathcal{W}_2({\rm I\!R})$.
  • Figure 3: Geometric interpretation of the standard univariate AR model.
  • Figure 4: Geometric interpretation of the Wasserstein multivariate AR model. The figure corresponds to a simple regression formula with $N=2$. The red point represents the value of $\mathrm{Exp}_{Leb}\left\{ 0.6(\widetilde{\bm T}^{1,t-1}_{Leb}-id) + 0.2(\widetilde{\bm T}^{2,t-1}_{Leb}-id)\right\}$, which is closer to $\widetilde{\bm \mu}_{1,t-1}$ because its regression coefficient is larger.
  • Figure 5: Centering method: $\bm F_{\widetilde{\bm{\mu}}}^{-1} = \bm F_{ \bm{\mu} }^{-1} \circ F_{\oplus}$. On the left are realizations of the random function $\bm F^{-1}_{\bm\mu}$, denoted by $F_t^{-1}, t = 1, \ldots, 10$. In the middle is $F_{\oplus}^{-1}$. On the right are $F_t^{-1}\circ F_{\oplus}, t = 1, \ldots, 10$, representing the realizations of $\bm F_{\widetilde{\bm{\mu}}}^{-1}$. We can see that the pointwise mean in the right subfigure is the identity function.
  • ...and 12 more figures
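
The centering step described in the Figure 5 caption can be checked numerically. The sketch below is an illustration under assumptions, not the authors' code: the ten quantile functions are synthetic power functions on $[0,1]$, $F_{\oplus}^{-1}$ is taken to be their pointwise mean, and $F_{\oplus}$ is obtained by inverting that mean with linear interpolation. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic quantile functions F_t^{-1}(p) = p ** a_t on a grid of levels p.
p = np.linspace(0.0, 1.0, 201)
exponents = rng.uniform(0.5, 2.0, size=10)
F_inv = p[None, :] ** exponents[:, None]      # shape (10, 201), rows increasing

# Assumed mean quantile function F_oplus^{-1}: pointwise average over t.
F_oplus_inv = F_inv.mean(axis=0)

# Invert it on a grid x of the support [0, 1] to get the CDF F_oplus(x).
x = np.linspace(0.0, 1.0, 101)
F_oplus = np.interp(x, F_oplus_inv, p)

# Centered quantile functions: F_t^{-1} composed with F_oplus, evaluated at x.
centered = np.stack([np.interp(F_oplus, p, q) for q in F_inv])

# The pointwise mean of the centered functions should match the identity.
max_dev = np.abs(centered.mean(axis=0) - x).max()
print(f"max deviation from identity: {max_dev:.2e}")
```

The check works because averaging commutes with evaluating at the common point $F_{\oplus}(x)$, so the mean of the centered functions equals $F_{\oplus}^{-1}(F_{\oplus}(x)) = x$.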

Theorems & Definitions (31)

  • Definition 1.1
  • Definition 2.1
  • Definition 2.2
  • Definition 2.2
  • Definition 2.3
  • Definition 2.4
  • Proposition 2.1
  • Definition 2.5
  • Proposition 4.1
  • Definition 4.1
  • ...and 21 more