The Impact of an XAI-Augmented Approach on Binary Classification with Scarce Data
Ximing Wen, Rosina O. Weber, Anik Sen, Darryl Hannan, Steven C. Nesbit, Vincent Chan, Alberto Goffi, Michael Morris, John C. Hunninghake, Nicholas E. Villalobos, Edward Kim, Christopher J. MacLellan
TL;DR
The paper addresses binary classification when positive ultrasound images are scarce. It introduces an Explainable AI-Augmented (XAIAUG) training framework that adds a differentiable, attribution-based prior, derived from Gradient SHAP, to the cross-entropy loss. The prior encourages the model's predictions to align with interpretable feature attributions. Explanations take the additive form $g(z)=\phi_{0}+\sum_{j=1}^{d}\phi_{j}(z_{j})$, and the training objective is $\mathcal{L}_{XAIAUG}(\theta)=\mathcal{L}(\theta;X,Y)+\lambda\Phi(X,Y)$, where $\mathcal{L}$ is the cross-entropy loss, $\Phi$ is the attribution-based prior, and $\lambda$ weights the prior. The approach is evaluated on three ultrasound datasets (PTX, ONSD, COVID-19) with 5-fold cross-validation. It shows consistent improvements in Balanced Accuracy and F1, improved local accuracy, and less overfitting than regularization baselines. The results suggest that XAIAUG improves the reliability of data-scarce ultrasound classifiers and better aligns their predictions with their attributions. The gains diminish as the amount of training data increases, so the approach is useful primarily in scarce-data regimes. One limitation is that the method is not benchmarked against all data-scarcity strategies. Proposed future work includes synthetic data, incremental learning, and broader comparisons with other scarcity-reduction techniques.
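To make the two formulas concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny logistic model and replaces Gradient SHAP with a deterministic stand-in: gradients averaged at evenly spaced points on the path from a baseline to the input (an Integrated Gradients style approximation). The prior used here, the mean squared gap between $f(x)$ and $g(x)$, is a hypothetical choice of $\Phi$ made only for illustration; the summary does not specify its form, and unlike $\Phi(X,Y)$ this stand-in ignores the labels. All function names, weights, and data are invented for the example.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def predict(w, b, x):
    """f(x): probability of the positive class from a logistic model."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def attributions(w, b, x, baseline, steps=50):
    """Path-averaged gradient attributions phi_j (stand-in for Gradient SHAP)."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [bj + alpha * (xj - bj) for xj, bj in zip(x, baseline)]
        p = predict(w, b, point)
        for j, wj in enumerate(w):
            avg_grad[j] += p * (1.0 - p) * wj / steps  # d f / d x_j at this point
    return [(xj - bj) * g for xj, bj, g in zip(x, baseline, avg_grad)]

def explanation(w, b, x, baseline):
    """g(x) = phi_0 + sum_j phi_j, taking phi_0 = f(baseline)."""
    return predict(w, b, baseline) + sum(attributions(w, b, x, baseline))

def cross_entropy(w, b, X, Y):
    """Mean binary cross-entropy L(theta; X, Y)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(X, Y):
        p = predict(w, b, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(X)

def prior(w, b, X, baseline):
    """Hypothetical Phi: mean squared local-accuracy gap f(x) - g(x)."""
    gaps = [predict(w, b, x) - explanation(w, b, x, baseline) for x in X]
    return sum(g * g for g in gaps) / len(gaps)

def xaiaug_objective(w, b, X, Y, baseline, lam):
    """L_XAIAUG = cross-entropy + lambda * Phi."""
    return cross_entropy(w, b, X, Y) + lam * prior(w, b, X, baseline)

# Illustrative values only; none of these come from the paper.
w, b = [1.5, -2.0], 0.1
X = [[0.2, 0.4], [1.0, -0.5], [-0.3, 0.8]]
Y = [0, 1, 0]
baseline = [0.0, 0.0]

print("cross-entropy:", cross_entropy(w, b, X, Y))
print("objective (lambda=10):", xaiaug_objective(w, b, X, Y, baseline, 10.0))
```

With enough path steps the additive explanation reproduces the model output almost exactly, which is what local accuracy means in this setting. In a real training loop the objective would be minimized with respect to the model parameters using an automatic differentiation framework, which this sketch leaves out.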
Abstract
Point-of-Care Ultrasound (POCUS) is the practice of clinicians conducting and interpreting ultrasound scans right at the patient's bedside. However, the expertise needed to interpret these images is considerable and may not always be present in emergency situations. This reality makes algorithms such as machine learning classifiers extremely valuable for augmenting human decisions. POCUS devices the size of a mobile phone are becoming available at a reasonable cost. The challenge in turning POCUS devices into life-saving tools is that interpreting ultrasound images requires specialist training and experience. Unfortunately, the difficulty of obtaining positive training images is an important obstacle to building efficient and accurate classifiers. Hence, we investigate strategies to increase the accuracy of classifiers trained with scarce data. We hypothesize that training with only a few data instances may not suffice for classifiers to generalize, causing them to overfit. We use an Explainable AI-Augmented approach to help the algorithm learn more from less and potentially help the classifier generalize better.
