Spectral structure learning for clinical time series
Ivan Lerner, Anita Burgun, Francis Bach
TL;DR
The paper develops StructGP, a structured multi-task Gaussian process for learning dependency graphs among irregularly sampled clinical time series across multiple patients. It encodes ordered conditional relations via a sparse impulse-response structure and learns a DAG using a differentiable acyclicity constraint, implemented through an augmented Lagrangian framework with proximal gradient updates and combined with a grid search for sparsity selection. In toy and larger simulation studies, the approach recovers the true graph structure with high recall and reasonable precision, and the results highlight the importance of the regularization path for graph identification. The work advances scalable, data-driven discovery of temporal dependencies in EHR time series and discusses practical considerations such as sensitivity to standardization and potential GPU-enabled scaling.
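To make the differentiable acyclicity constraint concrete, here is a minimal sketch of the function used in the original NOTEARS formulation, h(W) = tr(exp(W ∘ W)) − d, which is zero exactly when the weighted adjacency matrix W describes a DAG. The adapted version used for StructGP may differ in its details. The function name `acyclicity`, the `n_terms` parameter, and the example matrices are illustrative assumptions, and the matrix exponential is approximated here by a truncated power series rather than a library routine.

```python
import numpy as np

def acyclicity(W, n_terms=40):
    """NOTEARS-style penalty h(W) = tr(exp(W * W)) - d (truncated power series).

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    W = np.asarray(W, dtype=float)
    d = W.shape[0]
    A = W * W                # elementwise square: nonnegative edge strengths
    term = np.eye(d)         # holds A^k / k!, starting at k = 0
    h = 0.0
    for k in range(1, n_terms + 1):
        term = term @ A / k  # now A^k / k!
        h += np.trace(term)  # tr(A^k) > 0 only if a cycle of length k exists
    return h                 # the k = 0 term (trace d) is skipped, so this is tr(exp(A)) - d

# Assumed toy graphs: a chain 0 -> 1 -> 2, then the same chain with an edge 2 -> 0.
W_dag = np.array([[0.0, 1.5, 0.0],
                  [0.0, 0.0, -2.0],
                  [0.0, 0.0, 0.0]])
W_cyc = W_dag.copy()
W_cyc[2, 0] = 0.7            # closes the cycle 0 -> 1 -> 2 -> 0

h_dag = acyclicity(W_dag)
h_cyc = acyclicity(W_cyc)
```

For the chain, every power of `W * W` has a zero diagonal, so `h_dag` vanishes; closing the cycle makes `h_cyc` strictly positive. An augmented Lagrangian scheme drives this quantity toward zero while fitting the model.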
Abstract
We develop and evaluate a structure learning algorithm for clinical time series. Clinical time series are multivariate time series observed in multiple patients and irregularly sampled, which challenges existing structure learning algorithms. We assume that our time series are realizations of StructGP, a k-dimensional multi-output or multi-task stationary Gaussian process (GP), with independent patients sharing the same covariance function. StructGP encodes ordered conditional relations between time series, represented in a directed acyclic graph. We implement an adapted NOTEARS algorithm, which, based on a differentiable definition of acyclicity, recovers the graph by solving a series of continuous optimization problems. Simulation results show that, up to mean degree 3 and 20 tasks, we reach a median recall of 0.93 [IQR, 0.86-0.97] while keeping a median precision of 0.71 [IQR, 0.57-0.84] for recovering directed edges. We further show that the regularization path is key to identifying the graph. With StructGP, we propose a model of time series dependencies that flexibly adapts to different time series regularity while enabling us to learn these dependencies from observations.
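As an illustration of how recall and precision for directed edges can be computed, the sketch below compares an estimated adjacency matrix with a ground-truth one. It assumes the convention that a nonzero entry `[i, j]` denotes an edge from task i to task j, so a reversed edge counts as both a false positive and a false negative. The function name `directed_edge_scores`, the `tol` threshold, and the example graphs are hypothetical and are not taken from the paper's evaluation code.

```python
import numpy as np

def directed_edge_scores(W_true, W_est, tol=0.0):
    """Precision and recall over directed edges (entry [i, j] != 0 means i -> j)."""
    true_edges = np.abs(np.asarray(W_true, dtype=float)) > tol
    est_edges = np.abs(np.asarray(W_est, dtype=float)) > tol
    tp = np.sum(true_edges & est_edges)   # edges recovered with the right direction
    precision = tp / est_edges.sum() if est_edges.any() else 0.0
    recall = tp / true_edges.sum() if true_edges.any() else 0.0
    return float(precision), float(recall)

# Assumed toy example: true edges 0 -> 1, 1 -> 2, 0 -> 2.
W_true = np.array([[0, 1, 1],
                   [0, 0, 1],
                   [0, 0, 0]])
# The estimate recovers 0 -> 1 and 0 -> 2 but reverses 1 -> 2 into 2 -> 1.
W_est = np.array([[0, 1, 1],
                  [0, 0, 0],
                  [0, 1, 0]])

precision, recall = directed_edge_scores(W_true, W_est)
```

In this toy case two of the three estimated edges are correct and two of the three true edges are found, so both scores are 2/3.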
