Machine Learning Based Prediction of Surgical Outcomes in Chronic Rhinosinusitis from Clinical Data

Sayeed Shafayet Chowdhury, Karen D'Souza, V. Siva Kakumani, Snehasis Mukhopadhyay, Shiaofen Fang, Rodney J. Schlosser, Daniel M. Beswick, Jeremiah A. Alt, Jess C. Mace, Zachary M. Soler, Timothy L. Smith, Vijay R. Ramakrishnan

Abstract

Artificial intelligence (AI) has increasingly transformed medical prognostics by enabling rapid and accurate analysis across imaging and pathology. However, the investigation of machine learning predictions applied to prospectively collected, standardized data from observational clinical intervention trials remains underexplored, despite its potential to reduce costs and improve patient outcomes. Chronic rhinosinusitis (CRS), a persistent inflammatory disease of the paranasal sinuses lasting more than three months, imposes a substantial burden on quality of life (QoL) and societal cost. Although many patients respond to medical therapy, others with refractory symptoms often pursue surgical intervention. Surgical decision-making in CRS is complex, as it must weigh known procedural risks against uncertain individualized outcomes. In this study, we evaluated supervised machine learning models for predicting surgical benefit in CRS, using the Sino-Nasal Outcome Test-22 (SNOT-22) as the primary patient-reported outcome. Our prospectively collected cohort from an observational intervention trial comprised patients who all underwent surgery; we investigated whether models trained only on preoperative data could identify patients who might not have been recommended surgery prior to the procedure. Across multiple algorithms, including an ensemble approach, our best model achieved approximately 85% classification accuracy, providing accurate and interpretable predictions of surgical candidacy. Moreover, on a held-out set of 30 cases spanning mixed difficulty, our model achieved 80% accuracy, exceeding the average prediction accuracy of expert clinicians (75.6%), demonstrating its potential to augment clinical decision-making and support personalized CRS care.

Paper Structure (56 sections, 13 equations, 19 figures, 4 tables)

Figures (19)

  • Figure 1: Classic local manifestations of Chronic Rhinosinusitis (CRS). In addition, systemic manifestations such as fatigue and depression, together with the financial burden of direct and indirect costs, result in a large overall disease burden (Figure reused with permission from rudmik2015medical).
  • Figure 2: Schematic pipeline of the proposed decision-support approach. Structured fields from the electronic health record (EHR) undergo data cleaning and encoding, after which multiple in-house classifiers (e.g., MLP, logistic regression, SVM) are trained. Model outputs are combined via a majority-vote ensemble to generate a binary surgery recommendation (Yes/No).
  • Figure 3: Ensemble decision-support pipeline. Structured EHR data are cleaned and encoded, then passed in parallel to multiple base learners (Random Forest, Logistic Regression, SVM, MLP, etc.). Their predictions are aggregated via a voting scheme to produce a binary surgery recommendation (Yes/No). Only preoperative variables are used for inference.
  • Figure 4: Class distribution in the train (left) and test (right) sets. Bars show the number of CRS patients who obtained the desired surgery outcome (1) versus those who did not obtain the desired outcome (0). The splits retain the overall class imbalance (train: 338 vs. 81; test: 85 vs. 20), which is considered in model training and evaluation.
  • Figure 5: Exploratory feature and PCA analysis. (a) Explained variance by principal component. Each bar shows the proportion of total variance attributable to an individual component after standardization; early components carry the largest share, followed by a long tail. (b) Cumulative explained variance. The running total of variance captured as components are added; the dashed line marks 95% retained variance, indicating that a moderate subset of components suffices for compact representations. (c) Feature correlation heatmap (Pearson’s correlation coefficient, $r$). Blue denotes negative and red positive associations among preprocessed clinical predictors; the matrix helps flag redundancy/collinearity before modeling. (d) PCA score plot (PC1 vs. PC2). Patients are projected onto the first two principal components and colored by class label; the overlap across classes in this linear 2D view suggests that discrimination likely depends on higher-order components and/or non-linear structure.
  • ...and 14 more figures
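
The captions for Figures 2 to 4 describe a majority-vote ensemble over several base classifiers and a class imbalance that is taken into account during training. The following is a minimal sketch of those two ideas, not the authors' implementation. The function names, the per-model prediction lists, and the tie-breaking rule are hypothetical. The inverse-frequency weighting is one common way to handle imbalance and is assumed here, since this excerpt does not state which method was used. Only the class counts (338 vs. 81) come from the Figure 4 caption.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine binary predictions (0/1) from several models.

    predictions_per_model: list of lists, one inner list per model,
    each holding one prediction per patient.
    """
    n_patients = len(predictions_per_model[0])
    combined = []
    for i in range(n_patients):
        votes = Counter(model_preds[i] for model_preds in predictions_per_model)
        # Assumption: ties (possible with an even number of models)
        # resolve to 0 ("No"); the source does not specify a tie rule.
        combined.append(1 if votes[1] > votes[0] else 0)
    return combined


def balanced_class_weights(class_counts):
    """Inverse-frequency weights: n_total / (n_classes * n_class)."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: total / (n_classes * count)
            for label, count in class_counts.items()}


# Hypothetical outputs of three base learners for four patients.
rf_preds = [1, 1, 0, 1]
lr_preds = [1, 0, 0, 1]
svm_preds = [0, 1, 0, 1]
ensemble_preds = majority_vote([rf_preds, lr_preds, svm_preds])

# Training-set class counts taken from the Figure 4 caption.
train_counts = {1: 338, 0: 81}
weights = balanced_class_weights(train_counts)

print(ensemble_preds)
print({label: round(w, 3) for label, w in weights.items()})
```

With these weights, each minority-class patient (label 0) contributes roughly four times as much to a weighted loss as a majority-class patient, which is one way a model can be discouraged from predicting the majority class for every patient.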