Semi-Supervised Learning for Deep Causal Generative Models
Yasin Ibrahim, Hermione Warr, Konstantinos Kamnitsas
TL;DR
This work tackles counterfactual reasoning in medical imaging under missing labels by introducing a semi-supervised deep causal generative framework that combines a hierarchical VAE with predictive components for the causal variables $y_C$ and $y_E$. It uses ELBO-based losses for fully labelled, unlabelled, and partially labelled data, and adds counterfactual regularisation via do-interventions to enforce causal consistency, together with an invertible abduction-action-prediction scheme for counterfactual generation. The approach is demonstrated on Colour Morpho-MNIST and MIMIC-CXR, showing improved counterfactual accuracy, robustness to incomplete labels, and the ability to learn causal relationships under data scarcity, while analysing the independence of cause and mechanism through label availability. A limitation is the assumption of a known DAG; future work may address DAG misspecification and learning causal structure from data, with the potential to augment underrepresented populations in clinical datasets.
Abstract
Developing models that can answer questions of the form "How would x change if y had been z?" is fundamental to advancing medical image analysis. Training causal generative models that address such counterfactual questions, though, currently requires that all relevant variables have been observed and that the corresponding labels are available in the training data. However, clinical data may not have complete records for all patients, and state-of-the-art causal generative models are unable to take full advantage of such incomplete data. We thus develop, for the first time, a semi-supervised deep causal generative model that exploits the causal relationships between variables to maximise the use of all available data. We explore this in the setting where each sample is either fully labelled or fully unlabelled, as well as in the more clinically realistic case where different labels are missing for each sample. We leverage techniques from causal inference to infer missing values and subsequently generate realistic counterfactuals, even for samples with incomplete labels.
