Advancing fNIRS Neuroimaging through Synthetic Data Generation and Machine Learning Applications

Eitan Waks

TL;DR

This work addresses data scarcity in fNIRS tomography by generating large-scale synthetic data through Monte Carlo photon propagation and parametric head models. It establishes a containerized, reproducible data-analysis stack using Docker and Xarray, together with a cloud-based infrastructure that orchestrates scalable data generation. The synthetic datasets provide ground-truth data for supervised ML and facilitate learning of spatiotemporal photon migration patterns, with MCX outputs (.jnii and .jdat) feeding the ML pipelines. The Monte Carlo estimate of an expectation is the sample mean, $E[X] \approx \frac{1}{N} \sum_{i=1}^{N} x_i$, whose variance is $\sigma^2/N$ for $N$ independent samples with per-sample variance $\sigma^2$. Together, these contributions aim to improve accuracy, efficiency, and generalizability of fNIRS tomography, enabling broader clinical and research applications.
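The scaling of the Monte Carlo estimate quoted in the TL;DR can be checked numerically. The sketch below is illustrative only and is not the paper's MCX pipeline: it assumes a toy quantity, draws from a unit-rate exponential distribution (a common model for photon free-path lengths, with true mean 1 and variance 1), and the function names `mc_mean` and `estimator_variance` are hypothetical.

```python
import random
import statistics


def mc_mean(n, rng):
    """Monte Carlo estimate: sample mean of n draws from Exponential(rate=1)."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n


def estimator_variance(n, repeats, seed=0):
    """Empirical variance of the sample-mean estimator over many independent runs."""
    rng = random.Random(seed)
    estimates = [mc_mean(n, rng) for _ in range(repeats)]
    return statistics.pvariance(estimates)


if __name__ == "__main__":
    # Per-sample variance is 1 for this toy distribution, so the
    # estimator variance should be close to 1/N.
    for n in (100, 400, 1600):
        print(n, estimator_variance(n, repeats=2000), 1.0 / n)
```

Quadrupling the number of samples should cut the estimator variance by roughly a factor of four, which is why halving the statistical noise of a simulated measurement requires about four times as many photons.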

Abstract

This study presents an integrated approach for advancing functional Near-Infrared Spectroscopy (fNIRS) neuroimaging through the synthesis of data and application of machine learning models. By addressing the scarcity of high-quality neuroimaging datasets, this work harnesses Monte Carlo simulations and parametric head models to generate a comprehensive synthetic dataset, reflecting a wide spectrum of conditions. We developed a containerized environment employing Docker and Xarray for standardized and reproducible data analysis, facilitating meaningful comparisons across different signal processing modalities. Additionally, a cloud-based infrastructure is established for scalable data generation and processing, enhancing the accessibility and quality of neuroimaging data. The combination of synthetic data generation with machine learning techniques holds promise for improving the accuracy, efficiency, and applicability of fNIRS tomography, potentially revolutionizing diagnostics and treatment strategies for neurological conditions. The methodologies and infrastructure developed herein set new standards in data simulation and analysis, paving the way for future research in neuroimaging and the broader biomedical engineering field.

Paper Structure (24 sections, 13 equations, 12 figures, 8 tables, 1 algorithm)

Figures (12)

  • Figure 1: Detector positions displayed in 2-D. The detectors are embedded within a strip that is placed on the rear of the subject's head. The occipital pole is at the coincidence of the symmetry lines which are represented as red dotted lines.
  • Figure 2: Source positions on the strip placed on the subject's head, displayed in 2-D. The red dotted lines represent symmetry lines. The coincidence of the symmetry lines aligns with the occipital pole.
  • Figure 3: Source and detector positions displayed in 3-D. Detectors are represented as blue dots. Sources are represented as red dots. The z values were estimated using Bushby et al.'s model for estimating adult head circumference. The X plane is the plane parallel to the midsagittal plane, 9.25 mm lateral to detector 0 when the strip of detectors is embedded within the coronal plane. The Y plane is the plane parallel to the transverse plane, 9.25 mm inferior to detector 0. The Z plane is the plane parallel to the coronal plane and coincident with the occipital pole.
  • Figure 4: A red-blue heat map depicting the difference between 2-D and 3-D Euclidean (L2 norm) distances between sources and detectors. The heat map is normalized: red is positive displacement and blue is negative. The differences increase as NN increases.
  • Figure 5: The probability of a detector sensing a photon from a particular source decreases as source-detector separation (SDS) increases. The complexity and probability of photon paths are schematically represented. Overlapping paths may be used for blind tomography. Schematic photon propagation for source-detector pairs with a maximum NN (SDS) value of 7 for detectors 5, 11, 12, and 17 is displayed as green lines. The green lines are L2 norm distances whose saturation decreases with increasing NN values.
  • ...and 7 more figures