Biomedical Question Answering: A Survey of Approaches and Challenges

Qiao Jin, Zheng Yuan, Guangzhi Xiong, Qianlan Yu, Huaiyuan Ying, Chuanqi Tan, Mosha Chen, Songfang Huang, Xiaozhong Liu, Sheng Yu

TL;DR

This survey maps the landscape of biomedical question answering (BQA) across five approaches: classic, information retrieval, machine reading comprehension (MRC), knowledge base, and question entailment. It details representative datasets and methods for each approach, anchored by large biomedical datasets such as BioASQ and BioASQ-based MRC benchmarks, highlights the pivotal role of domain-specific pretrained language models (e.g., BioBERT, SciBERT, PubMedBERT), and discusses how modern systems increasingly combine multiple resources to draw on broader biomedical knowledge. The authors identify key challenges (data scale and quality, underuse of domain knowledge, explainability, evaluation, and fairness) and propose future directions, including dataset generation, cross-paradigm integration, and richer reasoning capabilities such as multi-hop and numeric reasoning. Although methods have improved substantially, the paper stresses that BQA systems are still rarely used in real-life settings, and that closing the gap to real-world biomedical decision support requires scalable, explainable, knowledge-integrated QA solutions.

Abstract

Automatic Question Answering (QA) has been successfully applied in various domains such as search engines and chatbots. Biomedical QA (BQA), as an emerging QA task, enables innovative applications to effectively perceive, access and understand complex biomedical knowledge. There have been tremendous developments of BQA in the past two decades, which we classify into 5 distinctive approaches: classic, information retrieval, machine reading comprehension, knowledge base and question entailment approaches. In this survey, we introduce available datasets and representative methods of each BQA approach in detail. Despite the developments, BQA systems are still immature and rarely used in real-life settings. We identify and characterize several key challenges in BQA that might lead to this issue, and discuss some potential future directions to explore.

Paper Structure

This paper contains 48 sections, 1 equation, 2 figures, and 11 tables.

Figures (2)

  • Figure 1: Overview of major biomedical question answering approaches. Boxes indicate methods or resources.
  • Figure 2: An instance of VQA-Med.