Enhancing Biomedical Named Entity Recognition using GLiNER-BioMed with Targeted Dictionary-Based Post-processing for BioASQ 2025 task 6

Ritesh Mehta

TL;DR

The paper tackles BioNER for PubMed abstracts on the gut-brain axis (BioASQ Task 6.1) by fine-tuning GLiNER-BioMed across 13 entity types and applying a dictionary-based post-processing strategy to correct common misclassifications. On the development set, the micro $F_1$-score rose from $0.7857$ to $0.8316$, but on the official test set the post-processed system scored only $0.7743$, revealing a generalization gap and possible overfitting of the post-processing rules. Comparative experiments show that CRF-based approaches underperform transformer-based models, reinforcing the strength of pre-trained biomedical transformers while highlighting the sensitivity of rule-based refinements to data shifts. The work emphasizes the importance of generalizable post-processing and outlines future directions, including leveraging larger ontologies, context-aware corrections, ensembles, and knowledge-base linking to improve robustness and downstream applicability.

Abstract

Biomedical Named Entity Recognition (BioNER), Task 6 in BioASQ (a challenge on large-scale biomedical semantic indexing and question answering), is crucial for extracting information from scientific literature but faces hurdles such as distinguishing between similar entity types like genes and chemicals. This study evaluates the GLiNER-BioMed model on a BioASQ dataset and introduces a targeted dictionary-based post-processing strategy to address common misclassifications. While this post-processing approach demonstrated notable improvement on our development set, increasing the micro F1-score from a baseline of 0.79 to 0.83, the gain did not generalize to the blind test set, where the post-processed model achieved a micro F1-score of 0.77 compared to the baseline's 0.79. We also discuss insights gained from exploring alternative methodologies, including Conditional Random Fields. This work highlights the potential of dictionary-based refinement for pre-trained BioNER models but underscores the critical challenge of overfitting to development data and the necessity of ensuring robust generalization for real-world applicability.
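
To make the idea of dictionary-based post-processing concrete, the following is a minimal sketch and not the paper's actual implementation. It assumes predictions arrive as `(start, end, text, label)` tuples; the dictionary entries, the generic-term list, the label names, and the function name `post_process` are all hypothetical and chosen only for illustration. It relabels spans whose surface form appears in a correction dictionary and drops spans that match a list of overly generic terms.

```python
from typing import Dict, List, Set, Tuple

# A predicted entity: (start offset, end offset, surface text, label).
Entity = Tuple[int, int, str, str]

# Hypothetical correction dictionary: lowercase surface form -> preferred label.
# A real system would presumably derive this from error analysis on development data.
CORRECTIONS: Dict[str, str] = {
    "serotonin": "chemical",
    "tnf-alpha": "gene",
}

# Hypothetical list of overly generic terms that should not be kept as entities.
GENERIC_TERMS: Set[str] = {"patients", "disease"}


def post_process(entities: List[Entity]) -> List[Entity]:
    """Relabel dictionary matches and drop generic terms; leave the rest unchanged."""
    cleaned: List[Entity] = []
    for start, end, text, label in entities:
        key = text.strip().lower()
        if key in GENERIC_TERMS:
            continue  # discard spans that are too generic to be useful
        # Use the dictionary label when the surface form is known, else keep the model's label.
        cleaned.append((start, end, text, CORRECTIONS.get(key, label)))
    return cleaned


if __name__ == "__main__":
    predictions = [
        (0, 9, "Serotonin", "gene"),
        (20, 28, "patients", "human"),
        (30, 34, "BDNF", "gene"),
    ]
    for entity in post_process(predictions):
        print(entity)
```

Because such rules are keyed on exact surface forms observed in one dataset, they can fit the development set closely and still fail to transfer, which is consistent with the test-set drop reported above.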

Paper Structure

This paper contains 30 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Entity label count in development set
  • Figure 2: Misclassifications between semantically similar or related entity types
  • Figure 3: Prediction of overly generic terms as specific entities
  • Figure 4: Suboptimal span boundaries
  • Figure 5: Flowchart of the BioNER system pipeline