Deep learning empowered sensor fusion boosts infant movement classification

Tomas Kulvicius, Dajie Zhang, Luise Poustka, Sven Bölte, Lennart Jahn, Sarah Flügge, Marc Kraft, Markus Zweckstetter, Karin Nielsen-Saines, Florentin Wörgötter, Peter B Marschik

TL;DR

Sensor fusion is a promising avenue for automated classification of infant motor patterns. A robust sensor fusion system may significantly enhance AI-based early recognition of neurofunctions, ultimately facilitating automated early detection of neurodevelopmental conditions.

Abstract

To assess the integrity of the developing nervous system, the Prechtl general movement assessment (GMA) is recognized for its clinical value in diagnosing neurological impairments in early infancy. GMA has been increasingly augmented through machine learning approaches intended to scale up its application, circumvent the costs of training human assessors, and further standardize the classification of spontaneous motor patterns. Available deep learning tools, all of which are based on single sensor modalities, still perform considerably worse than well-trained human assessors. These approaches are also hardly comparable, as all models are designed, trained and evaluated on proprietary, siloed datasets. In this study we propose a sensor fusion approach for assessing fidgety movements (FMs). FMs were recorded from 51 typically developing participants. We compared three sensor modalities (pressure, inertial, and visual sensors). Various modality combinations and two sensor fusion approaches (late and early fusion) were tested to evaluate whether a multi-sensor system outperforms single-modality assessments. Convolutional neural network (CNN) architectures were used to classify movement patterns. The three-sensor fusion (classification accuracy of 94.5%) performed significantly better than any single modality evaluated. We show that the sensor fusion approach is a promising avenue for automated classification of infant motor patterns. The development of a robust sensor fusion system may significantly enhance AI-based early recognition of neurofunctions, ultimately facilitating automated early detection of neurodevelopmental conditions.
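
The abstract names two fusion strategies without stating how the outputs are combined. The following is a minimal sketch of the distinction, not the authors' implementation. It assumes that late fusion averages the class probabilities of separately trained per-modality classifiers, and that early fusion concatenates per-time-step feature channels before a single network sees them. The function names, the two-class (FM-, FM+) logits, and the toy feature values are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(per_modality_logits):
    """Late fusion (assumed rule): average the class probabilities produced
    by independently trained single-modality classifiers."""
    probs = [softmax(logits) for logits in per_modality_logits]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def early_fusion(per_modality_features):
    """Early fusion (assumed rule): concatenate the feature channels of all
    modalities at each time step, giving one input for a single network.
    Assumes every modality was resampled to the same number of time steps."""
    n_steps = len(per_modality_features[0])
    if any(len(m) != n_steps for m in per_modality_features):
        raise ValueError("all modalities must share the same number of time steps")
    fused = []
    for t in range(n_steps):
        row = []
        for modality in per_modality_features:
            row.extend(modality[t])
        fused.append(row)
    return fused


# Hypothetical logits for the classes (FM-, FM+) from video, mat and IMU models.
fused_probs = late_fusion([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(fused_probs)

# Hypothetical features: 2 time steps with 2, 1 and 3 channels per modality.
video = [[1, 2], [3, 4]]
mat = [[5], [6]]
imu = [[7, 8, 9], [10, 11, 12]]
print(early_fusion([video, mat, imu]))
```

In this sketch late fusion needs no retraining when a modality is added or dropped, whereas early fusion lets a single model learn cross-modal interactions at the cost of requiring time-aligned inputs.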

Paper Structure (19 sections, 15 equations, 6 figures, 8 tables)

Figures (6)

  • Figure 1: Flow diagram of the study pipeline. N corresponds to the number of snippets (5 s data units) in each step. T1-T7 correspond to seven recording sessions at biweekly intervals, starting at 4 weeks post-term age. FM- and FM+ correspond to the absence and presence of fidgety movements, respectively. VID -- video data, MAT -- pressure mat data, IMU -- inertial measurement unit data.
  • Figure 1: Comparison of the classification accuracies on the test sets (9-fold cross-validation) using different skeleton features ($n=9$): positions of the key points only (Pos); positions and velocities of the key points (Pos+Vel); and positions, velocities and accelerations of the key points (Pos+Vel+Acc). We used a CNN architecture with three convolutional layers (kernel parameters for each layer [number of kernels, filter size]: 8, 13x1; 32, 17x1; 64, 25x1) and one fully connected layer (256 units). Gray circles correspond to the classification accuracies for each fold. The box lines correspond to the lower quartile, median, and upper quartile values, and the whiskers represent the range of the remaining data. The notches represent a robust estimate of the uncertainty about the medians.
  • Figure 2: Flow diagrams of the feature extraction procedures for three sensor modalities: a video data, b pressure mat data, and c IMU sensor data.
  • Figure 2: Comparison of the classification accuracies on the test sets (9-fold cross-validation) using different pressure mat features ($n=9$): the $x$, $y$, $p$ features only; and the $x$, $y$, $p$ features together with their first derivatives $dx$, $dy$, $dp$. We used a CNN architecture with three convolutional layers (kernel parameters for each layer [number of kernels, filter size]: 8, 13x1; 32, 17x1; 64, 25x1) and one fully connected layer (256 units). Gray circles correspond to the classification accuracies for each fold. The box lines correspond to the lower quartile, median, and upper quartile values, and the whiskers represent the range of the remaining data. The notches represent a robust estimate of the uncertainty about the medians.
  • Figure 3: Classification models. a Schematic diagram of the convolutional neural network (CNN) architecture. Hyperparameters for the different sensor modalities are specified in Supplementary Tables 2 and 3. b, c Schematic diagrams for two sensor fusion approaches: combination of three networks trained using single sensor modalities -- late sensor fusion (b), and one network trained using all sensor modalities -- early sensor fusion (c).
  • ...and 1 more figures
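
The captions above compare feature sets that add derivatives to the raw signals (Pos+Vel+Acc for key points; $dx$, $dy$, $dp$ for the pressure mat). The sketch below shows one way such channels could be derived. It is an illustration under stated assumptions, not the authors' procedure: the forward-difference scheme, the padding of the last sample, the time step `dt`, and both function names are hypothetical, since the listing does not say how the derivatives were computed or at what rate the sensors were sampled.

```python
def finite_difference(series, dt):
    """Approximate the first derivative with forward differences.
    The last difference is repeated so the output keeps the input length
    (an assumed padding choice)."""
    if len(series) < 2:
        raise ValueError("need at least two samples to take a difference")
    diffs = [(b - a) / dt for a, b in zip(series, series[1:])]
    return diffs + diffs[-1:]


def stack_derivatives(series, dt, order):
    """Return parallel channels [signal, 1st derivative, ..., order-th derivative]."""
    channels = [list(series)]
    for _ in range(order):
        channels.append(finite_difference(channels[-1], dt))
    return channels


# Hypothetical 1-D key-point coordinate sampled at dt = 1.0 (arbitrary units).
position = [0.0, 1.0, 4.0, 9.0]
pos, vel, acc = stack_derivatives(position, dt=1.0, order=2)
print(pos)  # [0.0, 1.0, 4.0, 9.0]
print(vel)  # [1.0, 3.0, 5.0, 5.0]
print(acc)  # [2.0, 2.0, 0.0, 0.0]
```

With `order=1` the same helper yields the signal-plus-first-derivative layout described for the pressure mat features; `order=2` corresponds to the Pos+Vel+Acc layout.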