Deep Learning for Detecting and Early Predicting Chronic Obstructive Pulmonary Disease from Spirogram Time Series

Shuhao Mei, Xin Li, Yuxi Zhou, Jiahao Xu, Yong Zhang, Yuxuan Wan, Shan Cao, Qinghao Zhao, Shijia Geng, Junqing Xie, Shengyong Chen, Shenda Hong

TL;DR

This work addresses the need for early COPD risk assessment from spirogram time series. It introduces DeepSpiro, a four-module deep learning pipeline that stabilizes signals (SpiroSmoother), extracts robust, patch-based features (SpiroEncoder), provides model explanations via volume attention (SpiroExplainer), and forecasts future COPD risk using patch concavity patterns (SpiroPredictor). On UK Biobank data, it achieves a COPD detection AUROC of 0.8328 and demonstrates significant predictive power for 1–5 year horizons (p-value < 0.001), outperforming a ResNet18 baseline. The approach offers interpretable risk assessments and potential for early screening, though generalizability and class imbalance warrant further validation before broader deployment. Overall, DeepSpiro advances COPD detection and long-term risk prediction by leveraging concavity features and attention-based explanations to support clinical decision-making.

Abstract

Chronic Obstructive Pulmonary Disease (COPD) is a chronic lung condition characterized by airflow obstruction. Current diagnostic methods primarily rely on identifying prominent features in spirometry (Volume-Flow time series) to detect COPD, but they are not adept at predicting future COPD risk based on subtle data patterns. In this study, we introduce a novel deep learning-based approach, DeepSpiro, aimed at the early prediction of future COPD risk. DeepSpiro consists of four key components: SpiroSmoother for stabilizing the Volume-Flow curve, SpiroEncoder for capturing volume variability patterns through key patches of varying lengths, SpiroExplainer for integrating heterogeneous data and explaining predictions through volume attention, and SpiroPredictor for predicting the disease risk of undiagnosed high-risk patients based on key patch concavity, with prediction horizons of 1, 2, 3, 4, 5 years, or even longer. Evaluated on the UK Biobank dataset, DeepSpiro achieved an AUC of 0.8328 for COPD detection and demonstrated strong predictive performance for future COPD risk (p-value < 0.001). In summary, DeepSpiro can effectively predict the long-term progression of COPD.

Paper Structure

This paper contains 18 sections, 15 equations, 10 figures, and 4 tables.

Figures (10)

  • Figure 1: Our input module uses the raw Time-Volume curve time series collected from hospitals and patient demographic data, which are then passed into our AI-based model module. The AI-based model module is divided into four tasks (see the overview section for details). After processing through the AI-based model module, the output data is handled by the output module. If the AI-based model diagnoses the individual as a COPD patient, we will output their diagnosis results and the interpretability figure of the model. If the AI-based model diagnoses the individual as a non-COPD patient, we will output their risk of developing COPD over the next 1–5 years.
  • Figure 2: Evaluation comparison. We compared the performance of three methods for detecting COPD: the FEV1/FVC ratio (with a threshold of 0.7), the ResNet18 model, and the DeepSpiro model. The evaluation metrics, AUROC and AUPRC, were assessed on three datasets: the full dataset, the hospitalization dataset (only individuals registered as inpatients), and the death dataset (only individuals registered as deceased).
  • Figure 3: The nomogram of COPD detection. The nomogram illustrates the contribution of demographic information and the FEV1/FVC diagnostic gold standard to the model's diagnostic accuracy for COPD. The nomogram allows for visual estimation of the probability of COPD diagnosis by assigning weighted scores to each variable. The “threshold” in the figure represents the cutoff value at which the predicted probability indicates a positive diagnosis for COPD. For instance, a threshold of 0.5 means that a predicted probability greater than 0.5 would be considered indicative of COPD.
  • Figure 4: (a) The left vertical axis represents the predicted probability of COPD, while the right vertical axis indicates the concavity degree based on the directed area metric for each phase. Each plot displays the mean concavity degree for the respective phase, illustrating how the concavity changes over time in relation to COPD risk. (b) This figure illustrates the probabilities of not being diagnosed with COPD over time for the high-risk and low-risk groups, as predicted by the model. The X-axis represents the time since the pulmonary function test, while the Y-axis shows the probability of not being diagnosed with COPD at each time point. Due to right censoring, not all high-risk patients are diagnosed within the observation period, resulting in probabilities that remain above zero. (c) As the onset time progresses, the concavity of the patient’s Volume-Flow curve decreases year by year.
  • Figure 5: (a) The subgroup analysis for smoking. (b) The subgroup analysis by sex. (c) The subgroup analysis by age. (d) As the onset time progresses, an individual’s concavity measure gradually decreases. Compared to non-smokers, smokers show significantly higher lung function concavity measures. (e) As the onset time progresses, an individual’s concavity measure gradually decreases. Compared to females, males show significantly higher lung function concavity measures. (f) As the onset time progresses, an individual’s concavity measure gradually decreases. Compared to younger patients, older patients show significantly higher lung function concavity measures.
  • ...and 5 more figures
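
The FEV1/FVC baseline named in the Figure 2 caption can be illustrated with a minimal sketch. It assumes the input is a cumulative exhaled-volume curve sampled over time, with volume in litres and time in seconds, and that a ratio below 0.7 is flagged as obstruction. The function names `fev1_fvc_ratio` and `fixed_ratio_flag` are hypothetical and are not taken from the paper, and this is not the DeepSpiro pipeline itself.

```python
import numpy as np


def fev1_fvc_ratio(time_s, volume_l):
    """Return (FEV1, FVC, FEV1/FVC) from a cumulative volume-time curve.

    Assumption: time_s is increasing and starts at the beginning of the
    forced exhalation; volume_l is the cumulative exhaled volume in litres.
    """
    time_s = np.asarray(time_s, dtype=float)
    volume_l = np.asarray(volume_l, dtype=float)
    fvc = float(volume_l.max())                      # total exhaled volume
    fev1 = float(np.interp(1.0, time_s, volume_l))   # volume exhaled at t = 1 s
    return fev1, fvc, fev1 / fvc


def fixed_ratio_flag(time_s, volume_l, threshold=0.7):
    """Flag airflow obstruction when FEV1/FVC falls below the threshold."""
    _, _, ratio = fev1_fvc_ratio(time_s, volume_l)
    return ratio < threshold


if __name__ == "__main__":
    # Synthetic exponential curves, used for illustration only.
    t = np.linspace(0.0, 6.0, 601)
    fast_emptying = 4.0 * (1.0 - np.exp(-t / 0.5))
    slow_emptying = 4.0 * (1.0 - np.exp(-t / 1.5))
    print(fixed_ratio_flag(t, fast_emptying))
    print(fixed_ratio_flag(t, slow_emptying))
```

A single thresholded ratio like this discards the shape of the curve, which is the information the patch-based and concavity-based modules described above are meant to exploit.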