SWoTTeD: An Extension of Tensor Decomposition to Temporal Phenotyping

Hana Sebia, Thomas Guyet, Etienne Audureau

TL;DR

SWoTTeD extends tensor decomposition to temporal phenotyping by modeling temporal patterns as latent phenotypes arranged over a fixed window and learned via convolution-based reconstruction. The method jointly learns a phenotype tensor $\mathcal{P}$ and patient-specific pathways $\bm{W}^{(k)}$ under a Bernoulli loss with sparsity and non-succession regularizers, providing interpretable temporal patterns. Empirical results show superior reconstruction and meaningful phenotypes on synthetic data and competitive performance on real EHR datasets, with a real-world ICU COVID-19 case study illustrating clinical relevance. The work introduces an open-source implementation and points to future improvements such as variable window sizes and solver refinements to enhance stability and applicability.

Abstract

Tensor decomposition has recently been gaining attention in the machine learning community for the analysis of individual traces, such as Electronic Health Records (EHR). However, this task becomes significantly more difficult when the data follows complex temporal patterns. This paper introduces the notion of a temporal phenotype as an arrangement of features over time and proposes SWoTTeD (Sliding Window for Temporal Tensor Decomposition), a novel method to discover hidden temporal patterns. SWoTTeD integrates several constraints and regularizations to enhance the interpretability of the extracted phenotypes. We validate our proposal using both synthetic and real-world datasets, and we present an original use case using data from the Greater Paris University Hospital. The results show that SWoTTeD achieves reconstruction at least as accurate as recent state-of-the-art tensor decomposition models and extracts temporal phenotypes that are meaningful for clinicians.

Paper Structure (47 sections, 12 equations, 21 figures, 6 tables, 1 algorithm)

This paper contains 47 sections, 12 equations, 21 figures, 6 tables, 1 algorithm.

Figures (21)

  • Figure 1: Illustration of an irregular tensor $\mathcal{X}=\{\bm{X}^{(k)}\}_{k\in[K]}$ representing a collection of $K$ patient stays. Each stay has its own duration $T_k$, but all stays share the same set of care events (in rows). A black cell at position $(i,t)$ (i.e., $x^{(k)}_{i,t}=1$) indicates that the $i$-th care event occurs at time $t$.
  • Figure 2: Illustration of the reconstruction of a matrix ($\bm{X}^{(k)}$) from $R=3$ phenotypes of size $\omega=2$ (on the left) and a care pathway ($\bm{W}^{(k)}$) (on the top). Each phenotype has a specific color. Each colored cell in $\bm{W}^{(k)}$ marks the start of a phenotype occurrence in the reconstruction (surrounded by a colored rectangle in $\bm{X}^{(k)}$). A cell with two colors receives contributions from two occurrences of different phenotypes.
  • Figure 3: Example of two alternative decompositions of a sequence of similar events that yield the same $\mathcal{L}^{\mathsmaller \circledast}$ value. Phenotype 1 does not capture the sequence of events, whereas phenotype 2 does. With phenotype 1, the sequence information is instead carried by the pathway.
  • Figure 4: $FIT_P$ (left) and $FIT_X$ (right) of SWoTTeD with $\omega=3$ on synthetic data. Each box plot summarizes 10 runs.
  • Figure 5: Comparison of $FIT_X$ (left) and $FIT_P$ (right) with respect to the $\beta$ hyper-parameter on a synthetic dataset with hidden phenotypes having repeated successive events.
  • ...and 16 more figures
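
The reconstruction described in the Figure 2 caption can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `reconstruct` and `bernoulli_loss`, the toy values, the pathway length $T_k - \omega + 1$, and the clamping constant `eps` are all assumptions made here for the example. Each non-zero pathway cell marks the start of a phenotype occurrence, and overlapping occurrences simply add up.

```python
import math


def reconstruct(phenotypes, pathway):
    """Rebuild one patient matrix by summing every phenotype occurrence.

    phenotypes[r][i][tau]: weight of feature i at relative time tau in phenotype r
    pathway[r][t]: activation of phenotype r starting at time t
    (assumed layout; the pathway is assumed to have T - omega + 1 columns)
    """
    n_features = len(phenotypes[0])
    omega = len(phenotypes[0][0])
    n_starts = len(pathway[0])
    duration = n_starts + omega - 1
    x_hat = [[0.0] * duration for _ in range(n_features)]
    for r, phenotype in enumerate(phenotypes):
        for start in range(n_starts):
            weight = pathway[r][start]
            if weight == 0:
                continue  # this phenotype does not start at this time step
            for i in range(n_features):
                for tau in range(omega):
                    x_hat[i][start + tau] += weight * phenotype[i][tau]
    return x_hat


def bernoulli_loss(x, x_hat, eps=1e-9):
    """Negative Bernoulli log-likelihood of binary x given probabilities x_hat.

    Values of x_hat are clamped to [eps, 1 - eps] to keep the logarithms finite
    (an assumption of this sketch).
    """
    total = 0.0
    for row, row_hat in zip(x, x_hat):
        for value, prob in zip(row, row_hat):
            prob = min(max(prob, eps), 1.0 - eps)
            total -= value * math.log(prob) + (1 - value) * math.log(1.0 - prob)
    return total


# Toy example: 2 phenotypes, 3 features, window size omega = 2.
phenotypes = [
    [[1, 0], [0, 1], [0, 0]],  # feature 0, then feature 1 one step later
    [[0, 0], [0, 0], [1, 1]],  # feature 2 on two consecutive steps
]
pathway = [
    [1, 0, 0, 1],  # phenotype 0 starts at t = 0 and t = 3
    [0, 1, 0, 0],  # phenotype 1 starts at t = 1
]
x_hat = reconstruct(phenotypes, pathway)
loss = bernoulli_loss([[1, 0, 0, 1, 0], [0, 1, 0, 0, 1], [0, 1, 1, 0, 0]], x_hat)
```

In this toy case the reconstruction matches the binary input exactly, so the loss is close to zero. In a learned model the phenotype and pathway values would be continuous and fitted by gradient descent, which this sketch does not attempt.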