Clinical information extraction for Low-resource languages with Few-shot learning using Pre-trained language models and Prompting
Phillip Richter-Pechanski, Philipp Wiesenbach, Dominic M. Schwab, Christina Kiriakou, Nicolas Geis, Christoph Dieterich, Anette Frank
TL;DR
This study tackles the challenge of extracting clinical information from German doctor’s letters in low-resource settings by evaluating few-shot learning via Pattern-Exploiting Training (PET) with domain- and task-adapted pretrained language models. It systematically compares PET against supervised baselines across multiple pretraining schemes and six shot sizes, using SHAP for token-level interpretability. The results show that a domain- and task-adapted gbert-base-comb PET model with contextual enrichment substantially improves accuracy, scoring up to 30.5 percentage points higher than a traditional supervised classification model at 20 shots, while maintaining interpretability and on-premise suitability. The paper provides practical recommendations for deploying clinical information extraction in low-resource languages, emphasizing pretraining data quality, context, smaller models, and SHAP-based explanations to support trustworthy decisions.
Abstract
Automatic extraction of medical information from clinical documents poses several challenges: the high cost of the required clinical expertise, limited interpretability of model predictions, restricted computational resources, and privacy regulations. Recent advances in domain adaptation and prompting methods have shown promising results with minimal training data using lightweight masked language models, which are well suited to established interpretability methods. We are the first to present a systematic evaluation of these methods in a low-resource setting, by performing multi-class section classification on German doctor's letters. We conduct extensive class-wise evaluations supported by Shapley values to validate the quality of our small training data set and to ensure the interpretability of model predictions. We demonstrate that a lightweight, domain-adapted pretrained model, prompted with just 20 shots, outperforms a traditional classification model by 30.5 percentage points in accuracy. Our results serve as a process-oriented guideline for clinical information extraction projects working in low-resource settings.
