Leveraging Machine Learning Models to Predict the Outcome of Digital Medical Triage Interviews
Sofia Krylova, Fabian Schmidt, Vladimir Vlassov
TL;DR
The paper tackles predicting triage outcomes from unfinished digital interviews by framing the task as a multiclass classification problem on sparse, high-dimensional questionnaire data from Platform24's Triage24. It compares five tree-based models and the TabTransformer, finding that LGBMClassifier and CatBoostClassifier exceed 80% accuracy on complete interviews, with accuracy declining as interviews become less complete (for LGBMClassifier, from 88.2% at full completeness to 45.7% at 40% completeness). The TabTransformer holds accuracy above 80% at every completeness level, at the cost of long training times. A key result is the linear relationship between interview completeness and predictive power for the decision-tree models, whereas the TabTransformer remains resilient to missing data. The work demonstrates the practical feasibility of integrating an ML side-car into deterministic digital triage to aid patients who exit interviews prematurely, and it highlights concrete directions for scalable, interpretable, and resource-aware deployment in healthcare settings.
Abstract
Many existing digital triage systems are questionnaire-based, guiding patients to appropriate care levels based on information (e.g., symptoms, medical history, and urgency) that patients provide when answering questionnaires. Such systems often use a deterministic model with predefined rules to determine care levels. They struggle with incomplete triage interviews because they can only assist patients who finish the process. In this study, we explore the use of machine learning (ML) to predict the outcomes of unfinished interviews, aiming to enhance patient care and service quality. Predicting triage outcomes from incomplete data is crucial for patient safety and healthcare efficiency. Our findings show that decision-tree models, particularly LGBMClassifier and CatBoostClassifier, achieve over 80\% accuracy in predicting outcomes from complete interviews, with prediction accuracy correlating linearly with the degree of interview completeness. For example, LGBMClassifier achieves 88.2\% prediction accuracy for interviews with 100\% completeness, 79.6\% for 80\% completeness, 58.9\% for 60\% completeness, and 45.7\% for 40\% completeness. The TabTransformer model demonstrated exceptional accuracy of over 80\% for all degrees of completeness but required extensive training time, indicating a need for more powerful computational resources. The study highlights the linear correlation between interview completeness and the predictive power of the decision-tree models.
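As a quick sanity check on the linearity claim, the four LGBMClassifier figures quoted above can be fitted with an ordinary least-squares line. This is only a sketch: the function name `linear_fit` is illustrative, the fit uses just the four accuracy values stated in the abstract, and it is not the authors' analysis.

```python
from math import sqrt

# Figures quoted in the abstract for LGBMClassifier:
# interview completeness (%) and the corresponding prediction accuracy (%).
completeness = [100.0, 80.0, 60.0, 40.0]
accuracy = [88.2, 79.6, 58.9, 45.7]


def linear_fit(xs, ys):
    """Return (slope, intercept, pearson_r) of the least-squares line ys ~ xs.

    Hypothetical helper written for this illustration only.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / sqrt(sxx * syy)
    return slope, intercept, pearson_r


slope, intercept, r = linear_fit(completeness, accuracy)
print(f"slope={slope:.3f}, intercept={intercept:.2f}, r={r:.3f}")
```

On these four points the fit gives a slope of about 0.74 accuracy points per percentage point of completeness and a Pearson coefficient of about 0.99, which is consistent with the reported linear trend, though four points are too few to establish it independently.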
