HuLP: Human-in-the-Loop for Prognosis
Muhammad Ridzuan, Mai Kassem, Numan Saeed, Ikboljon Sobirov, Mohammad Yaqub
TL;DR
HuLP addresses prognosis under missing data and limited support for clinician intervention by introducing a four-component architecture (encoder, intervention block, classifier, prognosticator) and a dual-loss framework that combines $\mathcal{L}_1$ for concept alignment with $\mathcal{L}_2$ for prognosis, yielding $\mathcal{L}_{final} = b\mathcal{L}_1 + (1 - b)\mathcal{L}_2$, where $b$ weights the two terms. It enables test-time human intervention by allowing clinicians to override concept activations, and simulates this during training by exposing ground-truth concepts with probability 0.25. Evaluations on CHAIMELEON and HECKTOR show that HuLP is competitive with state-of-the-art baselines, gains up to ~0.1 in time-dependent C-index with test-time interventions (particularly on CHAIMELEON), and is robust to missing data. By integrating EHR-derived concepts with imaging, the work improves interpretability and alignment with clinical workflows, suggesting practical value for reliable prognosis in oncology and motivating future work on clinical validation and the usability of disentangled features.
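The weighted loss and the training-time intervention described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `combined_loss` and `simulate_intervention` are hypothetical, the intervention is assumed to be sampled independently per concept, and a missing ground-truth concept (represented here as `None`) is assumed never to override the model's prediction. Only the formula $\mathcal{L}_{final} = b\mathcal{L}_1 + (1 - b)\mathcal{L}_2$ and the 0.25 probability come from the summary.

```python
import random


def combined_loss(concept_loss, prognosis_loss, b):
    """Return b * L_1 + (1 - b) * L_2 for a weight b in [0, 1]."""
    if not 0.0 <= b <= 1.0:
        raise ValueError("b must lie in [0, 1]")
    return b * concept_loss + (1.0 - b) * prognosis_loss


def simulate_intervention(predicted, ground_truth, p=0.25, rng=random):
    """Simulate clinician intervention during training.

    Assumption: each concept is replaced by its ground-truth value
    independently with probability p. A ground-truth value of None marks
    a missing concept and is never used to override the prediction.
    """
    mixed = []
    for pred, truth in zip(predicted, ground_truth):
        if truth is not None and rng.random() < p:
            mixed.append(truth)  # expose the ground-truth concept
        else:
            mixed.append(pred)   # keep the model's own prediction
    return mixed


if __name__ == "__main__":
    rng = random.Random(0)                 # seeded for reproducibility
    predicted = [0.2, 0.7, 0.4, 0.9]       # hypothetical concept activations
    ground_truth = [0.0, 1.0, None, 1.0]   # None = concept not recorded
    print(simulate_intervention(predicted, ground_truth, p=0.25, rng=rng))
    print(combined_loss(concept_loss=2.0, prognosis_loss=4.0, b=0.25))
```

At test time, the same override would be driven by a clinician rather than by random sampling: setting `p=1.0` in this sketch corresponds to correcting every concept for which a value is available.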
Abstract
This paper introduces HuLP, a Human-in-the-Loop for Prognosis model designed to enhance the reliability and interpretability of prognostic models in clinical contexts, especially when faced with the complexities of missing covariates and outcomes. HuLP offers an innovative approach that enables human expert intervention, empowering clinicians to interact with and correct models' predictions, thus fostering collaboration between humans and AI models to produce more accurate prognoses. Additionally, HuLP addresses the challenges of missing data by using neural networks and a methodology tailored to handle it. Traditional methods often struggle to capture the nuanced variations within patient populations, leading to compromised prognostic predictions. HuLP imputes missing covariates based on imaging features, aligning more closely with clinician workflows and enhancing reliability. We conduct our experiments on two real-world, publicly available medical datasets to demonstrate the superiority and competitiveness of HuLP.
