Predictive Distributions and the Transition from Sparse to Dense Functional Data

Álvaro Gajardo, Xiongtao Dai, Hans-Georg Müller

Abstract

A representation of Gaussian distributed sparsely sampled longitudinal data in terms of predictive distributions for their functional principal component scores (FPCs) maps available data for each subject to a multivariate Gaussian predictive distribution. Of special interest is the case where the number of observations per subject increases in the transition from sparse (longitudinal) to dense (functional) sampling of underlying stochastic processes. We study the convergence of the predicted scores given noisy longitudinal observations towards the true but unobservable FPCs, and under Gaussianity demonstrate the shrinkage of the entire predictive distribution towards a point mass located at the true FPCs and also extensions to the shrinkage of functional $K$-truncated predictive distributions when the truncation point $K=K(n)$ diverges with sample size $n$. To address the problem of non-consistency of point predictions, we construct predictive distributions aimed at predicting outcomes for the case of sparsely sampled longitudinal predictors in functional linear models and derive asymptotic rates of convergence for the $2$-Wasserstein metric between true and estimated predictive distributions. Predictive distributions are illustrated for longitudinal data from the Baltimore Longitudinal Study of Aging.
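For orientation, the predictive distribution described above can be written out with the standard Gaussian conditioning identity. This is a sketch in notation assumed here, not quoted from the paper: $\mathbf{Y}_i$ collects the $n_i$ noisy observations of subject $i$, $\boldsymbol{\mu}_i$ is the mean function at the observation times, $\boldsymbol{\Phi}_{iK}$ is the $n_i\times K$ matrix of the first $K$ eigenfunctions at those times, $\boldsymbol{\Lambda}_K$ is the diagonal matrix of the first $K$ eigenvalues, and $\boldsymbol{\Sigma}_{Y_i}$ is the covariance matrix of $\mathbf{Y}_i$ including the noise variance on its diagonal. Everything is conditional on the observation times.

```latex
\begin{align*}
\boldsymbol{\xi}_{iK} \mid \mathbf{Y}_i
  &\sim N_K\bigl(\tilde{\boldsymbol{\xi}}_{iK},\, \boldsymbol{\Sigma}_{iK}\bigr), \\
\tilde{\boldsymbol{\xi}}_{iK}
  &= \boldsymbol{\Lambda}_K \boldsymbol{\Phi}_{iK}^{\top}
     \boldsymbol{\Sigma}_{Y_i}^{-1} (\mathbf{Y}_i - \boldsymbol{\mu}_i), \\
\boldsymbol{\Sigma}_{iK}
  &= \boldsymbol{\Lambda}_K - \boldsymbol{\Lambda}_K \boldsymbol{\Phi}_{iK}^{\top}
     \boldsymbol{\Sigma}_{Y_i}^{-1} \boldsymbol{\Phi}_{iK} \boldsymbol{\Lambda}_K, \\
W_2^2\bigl(N(m_1, s_1^2),\, N(m_2, s_2^2)\bigr)
  &= (m_1 - m_2)^2 + (s_1 - s_2)^2 .
\end{align*}
```

The last line is the closed form of the $2$-Wasserstein distance between two univariate Gaussian distributions. It applies to the outcome predictive distributions only if both are univariate Gaussian, which is an assumption of this sketch.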

Paper Structure

This paper contains 15 sections, 25 theorems, 323 equations, 6 figures, 2 tables.

Key Result

Proposition 1

Suppose that assumptions \ref{a:fbelow}--\ref{a:GammaDiff} hold and the number of observations $n_i$ for the $i$th subject satisfies $n_i=m\to\infty$, $i=1,\dots,n$. Then, for any fixed $K\ge 1$, $k=1,\dots, K$, and $i=1,\dots,n$, as $m \rightarrow \infty$,

Figures (6)

  • Figure 1: The $95\%$ contours of $10$ predictive distributions for the first two functional principal component scores ($K=2$) of a new subject, obtained by randomly sampling that subject's data while varying the number of observations $n_i$ in the transition from sparse to dense: $n_i=2$ (very sparse; left panel), $n_i=10$ (medium sparse; middle panel), and $n_i=50$ (dense; right panel). The error variance is $\sigma^2=0.5^2$, the eigenfunctions are $\phi_1(t)=-\cos(\pi t/10)/\sqrt{5}$ and $\phi_2(t)=\sin(\pi t/10)/\sqrt{5}$, and the mean function is $\mu(t)=t+\sin(t)$, $t\in\mathcal{T}=[0,10]$. The time points are sampled from a uniform distribution on $\mathcal{T}$. As expected, the predictive distributions shrink towards a point mass located at the true unobserved functional principal component scores (black dot) as the data become denser. The colored dots mark the centers of the simulated predictive distributions.
  • Figure 2: Simulation results illustrating Propositions \ref{thm:xiEst} and \ref{thm:xiEst2} with $K=2$. The upper panel shows boxplots across $200$ simulations of the error term $\|\tilde{\boldsymbol{\xi}}_{iK} - \boldsymbol{\xi}_{iK}\|_2$ for very sparse ($m=2$, left), less sparse ($m=10$, middle), and dense ($m=50$, right) designs. The lower panel shows the corresponding results for $\|\boldsymbol{\Sigma}_{iK}\|_{\text{op},2}$.
  • Figure 3: Boxplots of the true underlying Wasserstein discrepancy measure $\mathcal{D}_{nK}$ (\ref{14}) in the functional linear model across $1000$ simulations with sample size $n=500$, for increasingly dense sampling designs and various noise levels for the predictor process $X$ and response $Y$.
  • Figure 4: Predictive distributions $\mathcal{P}_{K}$ for the response in the functional linear model obtained by simulating different sampling design scenarios for a given realization of the predictor process $X$, for very sparse $m=2$ (blue), sparse $m=8$ (green), less sparse $m=20$ (orange) and dense design $m=100$ (red), with $\sigma=\sigma_Y=0.5$. The vertical line corresponds to the (unobserved) predictable part $\eta_K$ of the response.
  • Figure 5: The first three estimated eigenfunctions reflecting the main modes of variation in the sample of sparsely observed BMI functional data from the Baltimore Longitudinal Study of Aging.
  • ...and 1 more figure
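The shrinkage described in the captions of Figures 1 and 2 can be reproduced in a few lines. The following is a minimal sketch, not the authors' code. It uses the eigenfunctions, mean function, and noise level from the Figure 1 caption (reading $0.5^2$ as the error variance) and assumes a process with exactly two components. The eigenvalues `lam = (4, 1)`, the random seed, and all function names are hypothetical choices made for illustration. The observation times are nested, so the dense design extends the sparse one.

```python
import numpy as np


def eigenfunctions(t):
    """Evaluate phi_1 and phi_2 from the Figure 1 caption; returns shape (m, 2)."""
    return np.column_stack(
        [-np.cos(np.pi * t / 10), np.sin(np.pi * t / 10)]
    ) / np.sqrt(5)


def mean_function(t):
    """Mean function mu(t) = t + sin(t) from the Figure 1 caption."""
    return t + np.sin(t)


def predictive_distribution(t, y, lam, sigma2):
    """Gaussian conditional mean and covariance of the scores given y at times t."""
    phi = eigenfunctions(t)
    Lam = np.diag(lam)
    cov_y = phi @ Lam @ phi.T + sigma2 * np.eye(len(t))  # Cov(Y_i)
    cross = Lam @ phi.T                                  # Cov(xi, Y_i), shape (K, m)
    mean = cross @ np.linalg.solve(cov_y, y - mean_function(t))
    cov = Lam - cross @ np.linalg.solve(cov_y, cross.T)
    return mean, cov


rng = np.random.default_rng(0)
lam = np.array([4.0, 1.0])   # ASSUMED eigenvalues, not given in the source
sigma2 = 0.5 ** 2            # error variance, as read from the Figure 1 caption

# One subject: true scores, then up to 50 noisy observations at uniform times.
xi_true = rng.normal(size=2) * np.sqrt(lam)
t_all = rng.uniform(0.0, 10.0, size=50)
noise = rng.normal(scale=np.sqrt(sigma2), size=50)
y_all = mean_function(t_all) + eigenfunctions(t_all) @ xi_true + noise

op_norms = {}
for m in (2, 10, 50):
    # Nested designs: the first m time points of the same subject.
    mean, cov = predictive_distribution(t_all[:m], y_all[:m], lam, sigma2)
    op_norms[m] = np.linalg.norm(cov, 2)  # spectral (operator) norm
    err = np.linalg.norm(mean - xi_true)
    print(f"m={m:2d}  score error={err:.3f}  ||Sigma||_op={op_norms[m]:.3f}")
```

Because the designs are nested, the operator norm of the predictive covariance cannot increase as $m$ grows: each added observation adds a positive semi-definite term to the posterior precision. The score error for a single subject need not decrease monotonically, which is why Figure 2 summarizes it across many simulations.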

Theorems & Definitions (50)

  • Proposition 1
  • Theorem 1
  • Proposition 2
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 7
  • Proof of Proposition \ref{thm:xiEst}
  • ...and 40 more