Overview of BioASQ 2022: The tenth BioASQ challenge on Large-Scale Biomedical Semantic Indexing and Question Answering

Anastasios Nentidis, Georgios Katsimpras, Eirini Vandorou, Anastasia Krithara, Antonio Miranda-Escalada, Luis Gasco, Martin Krallinger, Georgios Paliouras

TL;DR

The paper reports on BioASQ-10, the tenth edition of a long-running benchmark suite for large-scale biomedical semantic indexing and question answering, held as part of CLEF 2022. It details four tasks (10a, 10b, Synergy, and the new Spanish DisTEMIST track), covering English literature indexing, biomedical QA, COVID-19-focused QA collaboration, and Spanish-language disease annotation with SNOMED-CT grounding. The results show continued progress: top systems frequently outperform strong baselines, approaches are shifting toward transformer-based models, and new multilingual resources and corpora have been released. The authors discuss the practical impact on biomedical search and clinical information integration, and outline future directions to expand the benchmarks and extend Synergy beyond COVID-19 to other developing problems.

Abstract

This paper presents an overview of the tenth edition of the BioASQ challenge in the context of the Conference and Labs of the Evaluation Forum (CLEF) 2022. BioASQ is an ongoing series of challenges that promotes advances in the domain of large-scale biomedical semantic indexing and question answering. In this edition, the challenge was composed of the three established tasks a, b, and Synergy, and a new task named DisTEMIST for automatic semantic annotation and grounding of diseases from clinical content in Spanish, a key concept for semantic indexing and search engines of literature and clinical records. This year, BioASQ received more than 170 distinct systems from 38 teams in total for the four different tasks of the challenge. As in previous years, the majority of the competing systems outperformed the strong baselines, indicating the continuous advancement of the state-of-the-art in this domain.

Paper Structure (18 sections, 4 figures, 12 tables)

Figures (4)

  • Figure 1: Overview of the DisTEMIST Shared Task.
  • Figure 2: The micro f-measure (MiF) achieved by systems across different years of the BioASQ challenge. For each test set the MiF score is presented for the best performing system (Top) and the MTI, as well as the average micro f-measure of all the participating systems (Avg).
  • Figure 3: The evaluation scores of the best performing systems in task B, Phase B, for exact answers, across the ten years of the BioASQ challenge. Since BioASQ6 the official measure for Yes/No questions is the macro-averaged F1 score (macro F1), but accuracy (Acc) is also presented as the former official measure. The black dots in 10.6 highlight that these scores are for the additional batch with questions from new experts.
  • Figure 4: Micro-average F1-score distribution of PharmaCoNER, DisTEMIST, CodiEsp, CANTEMIST, and MEDDOCAN NER systems. The micro-average F1-scores of PharmaCoNER and DisTEMIST-entities baseline are shown in blue and red, respectively.