NOWJ @BioCreative IX ToxHabits: An Ensemble Deep Learning Approach for Detecting Substance Use and Contextual Information in Clinical Texts

Huu-Huy-Hoang Tran; Gia-Bao Duong; Quoc-Viet-Anh Tran; Thi-Hai-Yen Vuong; Hoang-Quynh Le

TL;DR

The paper addresses the extraction of substance-use information from Spanish clinical texts in a low-resource setting. It proposes a multi-output ensemble system built on a BETO-based BERT-CRF architecture, with a sentence-filtering pre-processing step and a majority-voting ensemble, to jointly detect triggers (Tobacco, Cannabis, Alcohol, Drug) and contextual arguments (Type, Method, Amount, Frequency, Duration, History). Evaluated on the ToxHabits dataset of 1,499 Spanish case reports, the best run achieves an F1 of 0.94 (precision 0.97) on Subtask 1 and an F1 of 0.91 (precision approximately 0.95) on Subtask 2, with sentence filtering improving precision and overall robustness. The work shows that careful architectural design and ensembling can yield strong performance in domain-specific, low-resource clinical NLP without relying on large language models, with practical benefits for clinical decision support and public health surveillance in Spanish-speaking contexts.

Abstract

Extracting drug use information from unstructured Electronic Health Records remains a major challenge in clinical Natural Language Processing (NLP). Although Large Language Models have advanced the field, their use in clinical NLP is limited by concerns over trust, control, and efficiency. To address this, we present the NOWJ submission to the ToxHabits Shared Task at BioCreative IX. The task targets the detection of toxic substance use and its contextual attributes in Spanish clinical texts, a domain-specific, low-resource setting. We propose a multi-output ensemble system that tackles both Subtask 1 (ToxNER) and Subtask 2 (ToxUse). Our system integrates BETO with a CRF layer for sequence labeling, employs diverse training strategies, and uses sentence filtering to boost precision. Our top run achieved 0.94 F1 and 0.97 precision for Trigger Detection, and 0.91 F1 for Argument Detection.
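As a rough illustration of the majority-voting ensemble mentioned above, the sketch below combines span predictions from several models. It is not the authors' implementation: the function name `majority_vote`, the `(start, end, label)` span format, the strict-majority threshold, and the example offsets are all assumptions made for the example.

```python
from collections import Counter


def majority_vote(predictions, min_votes=None):
    """Keep the spans predicted by at least `min_votes` models.

    `predictions` holds one collection of (start, end, label) spans per
    model. This span format is an assumption for illustration only.
    """
    if min_votes is None:
        # Default to a strict majority of the ensemble members.
        min_votes = len(predictions) // 2 + 1
    # Count each span at most once per model.
    votes = Counter(
        span for model_spans in predictions for span in set(model_spans)
    )
    return sorted(span for span, n in votes.items() if n >= min_votes)


# Hypothetical outputs of three models on the same document.
runs = [
    {(10, 16, "Tobacco"), (30, 37, "Alcohol")},
    {(10, 16, "Tobacco")},
    {(10, 16, "Tobacco"), (30, 37, "Alcohol"), (50, 57, "Drug")},
]
print(majority_vote(runs))
# [(10, 16, 'Tobacco'), (30, 37, 'Alcohol')]
```

Raising `min_votes` trades recall for precision: with `min_votes=3`, only the span that all three hypothetical models agree on is kept.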

Paper Structure (19 sections, 4 equations, 2 figures, 5 tables)

Introduction
Methodology
System Architecture
Pre-processing with Sentence Segmentation, Sentence Filtering, and Tokenization
Multi-output Sequence Labeling with BERT-CRF
Post-processing with Detokenization, Normalization, and Ensemble Inference
Multiple Training Strategies for Ensemble
Experiments and Results
Dataset Analysis
Evaluation Metrics
Experimental Settings
Experimental Results
Subtask 1: Trigger Detection (ToxNER)
Subtask 2: Argument Detection (ToxUse)
Discussion
...and 4 more sections

Figures (2)

Figure 1: Overall architecture of our multi-output BERT-CRF ensemble system.
Figure 2: An example of an annotated clinical text snippet from the ToxHabits corpus (image courtesy of the ToxHabits Shared Task organizers).
