Efficient Systematic Reviews: Literature Filtering with Transformers & Transfer Learning

John Hawkins, David Tivey

TL;DR

The paper tackles the bottleneck of abstract screening in biomedical systematic reviews by proposing a general-purpose filter trained on multiple research questions. It leverages domain-adapted transformers (BioBERT/BlueBERT) and transfer learning, exploring standard text matching, PICO-based representations, and fine-tuned inclusion predictors, evaluated under Leave-One-Question-Out validation to simulate novel questions. Results show that pre-trained transformers, especially when fine-tuned, improve exclusion efficiency and ranking quality, though no single approach dominates all questions; simpler linear models can excel for top-50 inclusions, depending on the use case. The work demonstrates the potential of scalable, question-agnostic screening in evidence synthesis and informs design choices for future automated screening tools, including when to favor PICO framing or larger models.

Abstract

Identifying critical research within the growing body of academic work is an intrinsic aspect of conducting quality research. Systematic review processes used in evidence-based medicine formalise this as a procedure that must be followed in a research program. However, this procedure places a growing burden on researchers in terms of the time required to identify the important articles for a given topic. In this work, we develop a method for building a general-purpose filtering system that matches a research question, posed as a natural language description of the required content, against a candidate set of articles obtained via the application of broad search terms. Our results demonstrate that transformer models, pre-trained on biomedical literature and then fine-tuned for the specific task, offer a promising solution to this problem. The model can remove large volumes of irrelevant articles for most research questions. Furthermore, analysis of the specific research questions in our training data suggests natural avenues for further improvement.
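
To make the matching step concrete, the following is a minimal sketch of the simplest kind of question-to-abstract matching: a bag-of-words cosine similarity used to rank and filter candidates. It is not the transformer-based method of the paper. The function names, the threshold value, and the toy question and abstracts are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(question, abstracts):
    """Return (index, score) pairs sorted from most to least similar."""
    q_vec = Counter(tokenize(question))
    scores = [(i, cosine(q_vec, Counter(tokenize(a))))
              for i, a in enumerate(abstracts)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def filter_candidates(question, abstracts, threshold=0.1):
    """Keep indices whose score meets the threshold (a hypothetical value)."""
    return [i for i, s in rank_candidates(question, abstracts) if s >= threshold]


# Toy data, invented for this sketch only.
question = "Does statin therapy reduce stroke risk in adults with diabetes?"
abstracts = [
    "A randomised trial of statin therapy and stroke risk in adults with diabetes.",
    "Soil moisture sensing using low-cost capacitive probes.",
    "Statin use and cardiovascular outcomes in older adults.",
]
ranking = rank_candidates(question, abstracts)
kept = filter_candidates(question, abstracts)
```

In this toy run the unrelated abstract shares no tokens with the question and is filtered out. A learned model would replace `cosine` with a score that generalises beyond exact word overlap.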

Paper Structure

This paper contains 12 sections, 2 equations, 2 figures, 8 tables.

Figures (2)

  • Figure 1: Schematic representation of the standard machine learning approach to building a systematic review filter tool. 1. A research question is proposed. 2. A database search is conducted. 3. A list of candidate articles is compiled. 4. Human annotators review a small selection of articles. 5. The small annotated collection is used to train a machine learning filter. 6. The machine learning filter is applied to the remaining documents. 7. This produces the final set for complete review by human experts.
  • Figure 2: Schematic representation of the general systematic review filter process. The first five steps define how the general filter is trained; the last two define its application to new research questions. 1. A set of training research questions is taken. 2. Database searches are conducted for each. 3. Lists of candidate articles are compiled. 4. Human annotators review the candidate articles for these training questions to create an annotated set. 5. The general filter is trained. 6. The general filter is applied to the documents retrieved for a new query. 7. This produces the final set for complete review by human experts.
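
The general filter in Figure 2 is meant to be applied to questions it has never seen, which is what Leave-One-Question-Out validation simulates. The sketch below shows one way such splits could be built: every question is held out once and the filter would be trained on the remaining questions. The record schema (`question_id`, `included`) and the helper `fraction_excluded_at_full_recall` are assumptions for illustration. The helper is one plausible exclusion measure, not necessarily the metric used in the paper.

```python
from collections import defaultdict


def leave_one_question_out(records):
    """Yield (held_out_id, train, test), holding each question out once.

    `records` is a list of dicts with a 'question_id' key (hypothetical schema).
    """
    by_question = defaultdict(list)
    for rec in records:
        by_question[rec["question_id"]].append(rec)
    for held_out in sorted(by_question):
        train = [r for qid, recs in by_question.items()
                 if qid != held_out for r in recs]
        yield held_out, train, by_question[held_out]


def fraction_excluded_at_full_recall(scored):
    """Fraction of articles scored below the lowest-scoring included article.

    `scored` is a list of (score, included) pairs for a single question.
    Illustrative measure only.
    """
    included_scores = [s for s, inc in scored if inc]
    if not included_scores:
        return 0.0
    cutoff = min(included_scores)
    return sum(1 for s, _ in scored if s < cutoff) / len(scored)


# Toy annotated records, invented for this sketch only.
records = [
    {"question_id": "q1", "included": True},
    {"question_id": "q1", "included": False},
    {"question_id": "q2", "included": False},
    {"question_id": "q3", "included": True},
    {"question_id": "q3", "included": False},
]
splits = list(leave_one_question_out(records))
example = fraction_excluded_at_full_recall(
    [(0.9, True), (0.4, True), (0.3, False), (0.1, False)]
)
```

Grouping by question rather than by article keeps every article of the held-out question out of training, so the evaluation reflects performance on a genuinely new research question.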