Temporal Feature Weaving for Neonatal Echocardiographic Viewpoint Video Classification

Satchel French, Faith Zhu, Amish Jain, Naimul Khan

TL;DR

This work addresses neonatal echocardiographic viewpoint classification by recasting it as video sequence classification rather than single-image labeling. It introduces Temporal Feature Weaving (TFW), a CNN-GRU architecture that weaves per-frame CNN features across time to form a temporal-spatial signature, yielding state-of-the-art accuracy (≈93.8%) and F1 (≈93.7%) on the 16-viewpoint Neonatal Echocardiogram Dataset (NED). A key contribution is the professionally labeled, open-source NED, alongside an architecture that maintains a modest model size (~30 million parameters) suitable for real-time smartphone deployment. The findings demonstrate that incorporating temporal dynamics significantly improves viewpoint discrimination in neonatal echocardiography, and the released dataset and model provide a practical, accessible resource to enhance screening and training in resource-limited settings.

Abstract

Automated viewpoint classification in echocardiograms can help under-resourced clinics and hospitals provide faster diagnosis and screening when expert technicians are not available. We propose a novel approach to echocardiographic viewpoint classification and show that treating the task as video classification rather than image classification is advantageous. We propose a CNN-GRU architecture with a novel temporal feature weaving method, which leverages both spatial and temporal information to yield a 4.33% increase in accuracy over baseline image classification while using only four consecutive frames. The proposed approach incurs minimal computational overhead. Additionally, we publish the Neonatal Echocardiogram Dataset (NED), a professionally annotated dataset providing sixteen viewpoints and associated echocardiography videos to encourage future work and development in this field. Code available at: https://github.com/satchelfrench/NED
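
To make the idea of combining features from four consecutive frames more concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify how the weaving is performed, so this sketch assumes one possible reading, in which per-frame feature vectors are interleaved element by element so that each feature's short time course sits contiguously. The function names `weave_features` and `sliding_clips` are hypothetical; the linked repository is the reference for the actual method.

```python
from typing import List


def sliding_clips(num_frames: int, clip_len: int = 4) -> List[List[int]]:
    """Return frame indices of every window of `clip_len` consecutive frames."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    return [list(range(start, start + clip_len))
            for start in range(num_frames - clip_len + 1)]


def weave_features(frames: List[List[float]]) -> List[float]:
    """Interleave per-frame feature vectors element by element.

    ASSUMPTION: this is an illustrative interpretation of "weaving", not the
    operation defined in the paper. frames[t][d] is feature d of frame t.
    The output lists feature 0 at t=0..T-1, then feature 1 at t=0..T-1, etc.
    """
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    if any(len(f) != dim for f in frames):
        raise ValueError("all frames must have the same feature length")
    return [frames[t][d] for d in range(dim) for t in range(len(frames))]


# Toy stand-in for CNN outputs: 4 frames, 2 features per frame.
toy_features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
woven = weave_features(toy_features)
clips = sliding_clips(num_frames=6, clip_len=4)
```

In a full pipeline, the toy vectors would be replaced by per-frame CNN embeddings, and the combined representation would be passed to a recurrent layer such as a GRU before classification.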

Paper Structure (19 sections, 6 equations, 15 figures, 4 tables)

Figures (15)

  • Figure 1: Class Balance of NED
  • Figure 2: ResNet-50-GRU-Temporal-Weave Architecture
  • Figure 3: ResNet-50-GRU-Temporal-Weave Architecture
  • Figure 4: Confusion Matrix for ResNet-GRU-TW-4-Consecutive
  • Figure 5: PR Curve for ResNet-GRU-TW-4-Consecutive
  • ...and 10 more figures