Temporal Gaussian Copula For Clinical Multivariate Time Series Data Imputation
Ye Su, Hezhe Qiao, Di Wu, Yuwen Chen, Lin Chen
TL;DR
The paper addresses the challenge of imputing irregularly missing values in clinical multivariate time series by introducing Temporal Gaussian Copula (TGC), a four-module framework that models cross-variable and temporal dependencies through a latent Gaussian representation. It unfolds 3D clinical data into a 2D matrix, fits a Gaussian copula on latent variables, and uses an EM algorithm to iteratively estimate the covariance and impute missing values, achieving robustness to varying missing rates. Empirical results on three real-world healthcare datasets show that TGC outperforms state-of-the-art methods with strong robustness to high missingness, supported by ablation studies that confirm the importance of unfolding and the Gaussian copula. The approach holds practical significance for improving accuracy in EHR-based analyses and decision support by providing reliable imputation across diverse sampling densities.
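The unfolding and latent-representation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the axis order (samples, time, variables), the helper names `unfold` and `to_latent_normal`, and the rank-based empirical CDF are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist


def unfold(tensor):
    """Flatten (samples, time, variables) into (samples, time * variables).

    The axis order is an assumption; the summary does not state the layout.
    """
    n, t, d = tensor.shape
    return tensor.reshape(n, t * d)


def to_latent_normal(column):
    """Map the observed entries of one column to normal scores; NaNs stay NaN."""
    z = np.full(column.shape, np.nan)
    observed = ~np.isnan(column)
    values = column[observed]
    # Rank of each observed value, 1..m (ties broken by original order).
    ranks = np.argsort(np.argsort(values, kind="stable"), kind="stable") + 1
    # Dividing by m + 1 keeps the empirical CDF strictly inside (0, 1).
    u = ranks / (values.size + 1)
    z[observed] = [NormalDist().inv_cdf(p) for p in u]
    return z


# Toy data: 4 patients, 3 time steps, 2 variables, with two missing entries.
x = np.arange(24, dtype=float).reshape(4, 3, 2)
x[0, 1, 0] = np.nan
x[2, 2, 1] = np.nan

flat = unfold(x)  # shape (4, 6): one row per patient
latent = np.column_stack(
    [to_latent_normal(flat[:, j]) for j in range(flat.shape[1])]
)
```

After unfolding, each column of `flat` is one (time step, variable) pair, so a single covariance matrix over the latent columns can carry both cross-variable and temporal dependence.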
Abstract
Imputing multivariate time series (MTS) is particularly challenging because MTS typically contain irregular patterns of missing values caused by factors such as instrument failures, interference from irrelevant data, and privacy regulations. Existing statistical and deep learning methods have shown promising results in time series imputation. In this paper, we propose a Temporal Gaussian Copula (TGC) model for third-order MTS imputation. The key idea is to leverage the Gaussian copula to capture cross-variable and temporal relationships through a latent Gaussian representation. We then employ an Expectation-Maximization (EM) algorithm to improve robustness on data with varying missing rates. Comprehensive experiments on three real-world MTS datasets demonstrate that TGC substantially outperforms state-of-the-art imputation methods and is more robust to varying missing ratios in the test dataset. Our code is available at https://github.com/MVL-Lab/TGC-MTS.
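To make the EM step concrete, the following is a minimal sketch of EM-based imputation under a zero-mean Gaussian model on latent variables. It is a generic textbook procedure, not the code from the TGC repository: the function name `em_gaussian_impute`, the fixed iteration count, the rescaling of the covariance to a correlation matrix, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def em_gaussian_impute(z, n_iter=50):
    """Impute NaNs in z (rows = samples) under a zero-mean Gaussian model."""
    n, p = z.shape
    missing = np.isnan(z)
    filled = np.where(missing, 0.0, z)  # start from the latent mean (zero)
    sigma = np.eye(p)
    for _ in range(n_iter):
        second_moment = np.zeros((p, p))
        for i in range(n):
            m, o = missing[i], ~missing[i]
            cond_cov = np.zeros((p, p))
            if m.any() and o.any():
                # E-step: conditional mean and covariance of the missing
                # block given the observed block.
                gain = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
                filled[i, m] = gain @ z[i, o]
                cond_cov[np.ix_(m, m)] = (
                    sigma[np.ix_(m, m)] - gain @ sigma[np.ix_(o, m)]
                )
            elif m.any():
                # Nothing observed in this row: fall back to the prior.
                filled[i, m] = 0.0
                cond_cov = sigma.copy()
            second_moment += np.outer(filled[i], filled[i]) + cond_cov
        # M-step: update the covariance from the expected second moments,
        # then rescale to unit diagonal (an assumed copula convention).
        sigma = second_moment / n
        scale = np.sqrt(np.diag(sigma))
        sigma = sigma / np.outer(scale, scale)
    return filled, sigma


# Synthetic latent data with about 20% of entries removed at random.
rng = np.random.default_rng(0)
true_corr = np.array([[1.0, 0.8, 0.5], [0.8, 1.0, 0.6], [0.5, 0.6, 1.0]])
z_full = rng.multivariate_normal(np.zeros(3), true_corr, size=500)
mask = rng.random(z_full.shape) < 0.2
z_missing = np.where(mask, np.nan, z_full)

z_imputed, corr_hat = em_gaussian_impute(z_missing)
```

In a copula setting the imputed latent values would then be mapped back to the original scale through each column's inverse empirical CDF; that step is omitted here.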
