Survey: Understand the challenges of Machine Learning Experts using Named Entity Recognition Tools
Florian Freund, Philippe Tamla, Matthias Hemmje
TL;DR
The paper investigates how ML/NLP experts evaluate Named Entity Recognition (NER) tools and the challenges they face when selecting tools for information retrieval in healthcare. It adopts Kasunic's expert-survey framework to design and deploy a questionnaire, collecting 23 responses between August and October 2024. Key findings show that performance is the most important criterion across tools. Cloud-based solutions raise considerations around cost and usability, while the aim of reducing the learning burden favors locally installed options. Open-source large language models are notably relevant in domain-specific NER tasks. The results inform the development of decision-support systems that help domain experts choose NER tools for Clinical Practice Guidelines (CPG), and they highlight the need for flexible, domain-aware evaluation criteria in future tooling.
Abstract
This paper presents a survey based on Kasunic's survey research methodology to identify the criteria used by Machine Learning (ML) experts to evaluate Named Entity Recognition (NER) tools and frameworks. Comparing and selecting NER tools and frameworks is a critical step in leveraging NER for Information Retrieval to support the development of Clinical Practice Guidelines. In addition, this study examines the main challenges faced by ML experts when choosing suitable NER tools and frameworks. Following Nunamaker's methodology, the article begins with an introduction to the topic, contextualizes the research, reviews the state of the art in science and technology, and identifies challenges for an expert survey on NER tools and frameworks. This is followed by a description of the survey's design and implementation. The paper then evaluates the survey results and the insights gained, and closes with a summary and conclusions.
