Efficient Systematic Reviews: Literature Filtering with Transformers & Transfer Learning
John Hawkins, David Tivey
TL;DR
The paper tackles the bottleneck of abstract screening in biomedical systematic reviews by proposing a general-purpose filter trained on multiple research questions. It leverages domain-adapted transformers (BioBERT/BlueBERT) and transfer learning, exploring standard text matching, PICO-based representations, and fine-tuned inclusion predictors, evaluated under Leave-One-Question-Out validation to simulate novel questions. Results show that pre-trained transformers, especially when fine-tuned, improve exclusion efficiency and ranking quality, though no single approach dominates all questions; simpler linear models can excel for top-50 inclusions, depending on the use case. The work demonstrates the potential of scalable, question-agnostic screening in evidence synthesis and informs design choices for future automated screening tools, including when to favor PICO framing or larger models.
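As a rough illustration of the Leave-One-Question-Out protocol mentioned above, the sketch below holds out every article belonging to one research question and trains on the rest, so the held-out question plays the role of a novel one. The record layout `(question_id, abstract, label)`, the function name `leave_one_question_out`, and the toy data are assumptions made for this example; they are not taken from the paper.

```python
from collections import defaultdict


def leave_one_question_out(records):
    """Yield (held_out_question, train, test) splits.

    `records` is a list of (question_id, abstract, label) tuples.
    This layout is a hypothetical one chosen for illustration.
    """
    by_question = defaultdict(list)
    for record in records:
        by_question[record[0]].append(record)

    for held_out in sorted(by_question):
        # Every article for the held-out question goes to the test set.
        test = by_question[held_out]
        # All other questions form the training set.
        train = [
            row
            for question, rows in by_question.items()
            if question != held_out
            for row in rows
        ]
        yield held_out, train, test


# Toy data (invented): label 1 = include, 0 = exclude.
records = [
    ("q1", "statins and stroke risk", 1),
    ("q1", "soil bacteria survey", 0),
    ("q2", "aspirin in pregnancy", 1),
    ("q3", "exercise and depression", 1),
    ("q3", "bridge load testing", 0),
]

splits = list(leave_one_question_out(records))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

Because no article from the held-out question appears in training, the score on each split estimates how a filter might behave on a question it has never seen.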
Abstract
Identifying critical research within the growing body of academic work is an intrinsic aspect of conducting quality research. Systematic review processes used in evidence-based medicine formalise this as a procedure that must be followed in a research program. However, this procedure carries an increasing burden in terms of the time required to identify the important research articles for a given topic. In this work, we develop a method for building a general-purpose filtering system that matches a research question, posed as a natural language description of the required content, against a candidate set of articles obtained via the application of broad search terms. Our results demonstrate that transformer models, pre-trained on biomedical literature and then fine-tuned for the specific task, offer a promising solution to this problem. The model can remove large volumes of irrelevant articles for most research questions. Furthermore, analysis of the specific research questions in our training data suggests natural avenues for further improvement.
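To make the matching task concrete, here is a minimal sketch that ranks candidate abstracts against a natural-language research question. It uses bag-of-words cosine similarity as a simple stand-in for text matching; it is not the transformer-based method the paper evaluates. The helper names (`tokenize`, `cosine`, `rank_candidates`) and the example texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(question, abstracts):
    """Return abstract indices ordered from most to least similar."""
    q_vec = Counter(tokenize(question))
    scored = [
        (cosine(q_vec, Counter(tokenize(text))), index)
        for index, text in enumerate(abstracts)
    ]
    # Sort by descending score; break ties by original position.
    return [index for _, index in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Invented example texts, for illustration only.
question = "Does statin therapy reduce stroke risk in adults?"
abstracts = [
    "A survey of soil bacteria across alpine meadows.",
    "Statin therapy and stroke risk: a randomised trial in adults.",
    "Stroke rehabilitation outcomes after physiotherapy.",
]

ranking = rank_candidates(question, abstracts)
print(ranking)
```

A screening tool could then discard the tail of such a ranking or pass only the top-ranked abstracts to human reviewers; where to place that cut-off depends on how much recall the review can afford to lose.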
