Detection of fields of applications in biomedical abstracts with the support of argumentation elements
Mariana Neves
TL;DR
This work addresses extracting fields of application from biomedical publications by leveraging argumentative elements rather than full abstracts. It builds a multiclass, multilabel corpus of roughly 2,000 biomedical abstracts across eight labels and fine-tunes PubMedBERT to predict these labels, comparing the use of title+abstract against specific argumentative components. Three argument-mining tools (ArguminSci, HSLN, MARGOT) are evaluated; the best results come from selecting informative argumentative elements, most notably the Conclusion component, with F1 reaching about 0.84 on some labels. The study provides a new resource and benchmarking insights for domain-specific biomedical text classification, highlighting the practical potential of targeted argumentative signals to improve information retrieval and analysis in biomedicine.
Abstract
Focusing on particular facts, instead of the complete text, can potentially improve searching for specific information in the scientific literature. In particular, argumentative elements allow focusing on specific parts of a publication, e.g., the background section or the claims from the authors. We evaluated some tools for the extraction of argumentation elements for a specific task in biomedicine, namely, detecting the fields of application in a biomedical publication, e.g., whether it addresses the problem of disease diagnosis or drug development. We performed experiments with the PubMedBERT pre-trained model, which was fine-tuned on a specific corpus for the task. We compared the use of title and abstract to restricting the input to only some argumentative elements. The top F1 scores ranged from 0.22 to 0.84, depending on the field of application. The best argumentative labels were the ones related to the conclusion and background sections of an abstract.
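The comparison described above can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this work: it only shows how a classifier input might be restricted to selected argumentative elements, and how a per-label F1 score can be computed for multilabel predictions. The function names, the argumentative tags ("background", "method", "conclusion"), the field labels ("diagnosis", "drug_development"), and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple


def select_elements(sentences: Iterable[Tuple[str, str]], keep: Set[str]) -> str:
    """Join the sentences whose argumentative tag is in `keep`.

    `sentences` is a list of (tag, text) pairs, as an argument-mining
    tool might produce them (hypothetical format).
    """
    return " ".join(text for tag, text in sentences if tag in keep)


def per_label_f1(
    gold: List[Set[str]], pred: List[Set[str]], labels: List[str]
) -> Dict[str, float]:
    """Compute F1 separately for each label in a multilabel setting."""
    scores = {}
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if label in g and label in p)
        fp = sum(1 for g, p in pairs if label not in g and label in p)
        fn = sum(1 for g, p in pairs if label in g and label not in p)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the label never occurs
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Toy abstract with hypothetical argumentative tags.
abstract = [
    ("background", "Early diagnosis remains difficult."),
    ("method", "We trained a classifier on imaging data."),
    ("conclusion", "The approach supports earlier diagnosis."),
]
model_input = select_elements(abstract, {"background", "conclusion"})

# Toy gold and predicted label sets for three documents (illustrative only).
gold = [{"diagnosis"}, {"diagnosis", "drug_development"}, {"drug_development"}]
pred = [{"diagnosis"}, {"drug_development"}, {"diagnosis", "drug_development"}]
scores = per_label_f1(gold, pred, ["diagnosis", "drug_development"])
```

Reporting F1 per label, rather than a single averaged score, matches the way the results are stated here: the top score differs considerably from one field of application to another.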
