How You Split Matters: Data Leakage and Subject Characteristics Studies in Longitudinal Brain MRI Analysis

Dewinda Julianensi Rumala

TL;DR

This study examines data leakage in longitudinal brain MRI analysis with 3D CNNs, comparing subject-wise, record-wise, and late-wise data splits. Using ADNI-derived T1/T2 MRIs and Grad-CAM, it shows that record-wise and late-wise splits yield inflated cross-validation performance because of identity confounding, whereas subject-wise splitting combined with hold-out evaluation gives a more reliable estimate of generalization. The findings underscore the importance of early, subject-wise data partitioning and external validation for robust longitudinal MRI classification tasks such as Alzheimer's disease analysis. The work offers practical guidance for evaluating deep learning models in medical imaging and highlights the need for larger, balanced datasets to mitigate under-fitting and demographic biases.

Abstract

Deep learning models have revolutionized the field of medical image analysis, offering significant promise for improved diagnostics and patient care. However, their reported performance can be misleadingly optimistic because of a hidden pitfall known as 'data leakage'. In this study, we investigate data leakage in 3D medical imaging, specifically using 3D Convolutional Neural Networks (CNNs) for brain MRI analysis. While 3D CNNs appear less prone to leakage than their 2D counterparts, improper data splitting during cross-validation (CV) can still cause problems, especially with longitudinal imaging data that contain repeated scans from the same subject. We explore the impact of different data splitting strategies on model performance for longitudinal brain MRI analysis and identify potential data leakage concerns. Grad-CAM visualization helps reveal shortcuts in CNN models caused by identity confounding, where the model learns to identify subjects along with diagnostic features. Our findings, consistent with prior research, underscore the importance of subject-wise splitting and of further evaluating the model on hold-out data from different subjects to ensure the integrity and reliability of deep learning models in medical image analysis.
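The difference between subject-wise and record-wise splitting can be illustrated with a minimal sketch. This is not the paper's code: the function names, the toy cohort of 10 subjects with 3 visits each, and the fold-assignment scheme are all assumptions made for illustration. The sketch assigns scans to folds in both ways and then lists the subjects whose scans end up in more than one fold, which is the condition that allows identity confounding.

```python
import random
from collections import defaultdict


def subject_wise_folds(subject_ids, k=5, seed=0):
    """Assign each scan to a fold so that all scans of a subject share one fold."""
    subjects = sorted(set(subject_ids))
    random.Random(seed).shuffle(subjects)
    fold_of_subject = {s: i % k for i, s in enumerate(subjects)}
    return [fold_of_subject[s] for s in subject_ids]


def record_wise_folds(subject_ids, k=5, seed=0):
    """Assign scans to folds without looking at subject identity (leak-prone)."""
    order = list(range(len(subject_ids)))
    random.Random(seed).shuffle(order)
    folds = [0] * len(subject_ids)
    for rank, idx in enumerate(order):
        folds[idx] = rank % k
    return folds


def leaking_subjects(subject_ids, folds):
    """Return the subjects whose scans appear in more than one fold."""
    seen = defaultdict(set)
    for subject, fold in zip(subject_ids, folds):
        seen[subject].add(fold)
    return {s for s, fs in seen.items() if len(fs) > 1}


# Toy longitudinal cohort (hypothetical): 10 subjects, 3 visits each.
scans = [f"sub-{i:02d}" for i in range(10) for _ in range(3)]

subj_folds = subject_wise_folds(scans)
rec_folds = record_wise_folds(scans)

print("subject-wise leaks:", sorted(leaking_subjects(scans, subj_folds)))
print("record-wise leaks: ", sorted(leaking_subjects(scans, rec_folds)))
```

In practice, a library routine that accepts group labels (for example, a grouped k-fold splitter keyed on subject ID) serves the same purpose; the hand-written version here only keeps the example dependency-free.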

Paper Structure (15 sections, 2 figures, 4 tables)

Introduction
Related work.
Methods
Data Collection and Processing
Training Setup
Evaluation Scheme
Subject-Wise Split.
Record-Wise Split.
Late Split.
Result
Evaluation Results on 5-Fold Data.
Further Evaluation on More Subjects.
Grad-CAM Visualization.
Discussion and Conclusion
Acknowledgements

Figures (2)

Figure 1: Grad-CAM visualization examples for Normal Control (CN) and Alzheimer's Disease (AD) classes under different data split schemes on axial slices of T1-weighted MRI. Each pair of rows depicts (i) correctly classified images, and (ii) misclassified images across all splitting strategies.
Figure S1: A toy example of different data split strategies for longitudinal brain MRI. (a) Subject-wise splitting groups all image scans based on the subjects into k-folds. (b) Record-wise splitting groups image scans based on different visit times into k-folds. (c) Late-wise splitting groups image scans based on transformation technique into k-folds.
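The late-wise case in Figure S1(c) can be sketched under one assumption: that "late" means the split is made after image transformations have been applied, so variants of the same source scan are pooled before being divided. The `augment` and `split` helpers, the transformation names, and the 20 toy scans below are hypothetical and are not taken from the paper. The sketch compares splitting after transformation with splitting before it.

```python
import random


def augment(scan_id):
    """Stand-in for image transformations: the original plus two hypothetical variants."""
    return [(scan_id, t) for t in ("orig", "flip", "rotate")]


def split(items, test_fraction=0.2, seed=0):
    """Shuffle items and return (train, test) lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = max(1, int(len(items) * test_fraction))
    return items[n_test:], items[:n_test]


scan_ids = [f"scan-{i:02d}" for i in range(20)]

# Late split (assumed meaning): transform first, then split the pooled variants.
late_train, late_test = split(v for s in scan_ids for v in augment(s))
late_overlap = {s for s, _ in late_train} & {s for s, _ in late_test}

# Early split: split the source scans first, then transform each side separately.
train_ids, test_ids = split(scan_ids)
early_train = [v for s in train_ids for v in augment(s)]
early_test = [v for s in test_ids for v in augment(s)]
early_overlap = {s for s, _ in early_train} & {s for s, _ in early_test}

print("source scans on both sides, late split: ", len(late_overlap))
print("source scans on both sides, early split:", len(early_overlap))
```

For longitudinal data, the early split would additionally need to be made per subject rather than per scan, since separate scans of one subject can still carry identity information across the train/test boundary.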