LSTM-based Deep Neural Network With A Focus on Sentence Representation for Sequential Sentence Classification in Medical Scientific Abstracts
Phat Lam, Lam Pham, Tin Nguyen, Hieu Tang, Michael Seidl, Medina Andresel, Alexander Schindler
TL;DR
This work addresses Sequential Sentence Classification (SSC) in medical abstracts by focusing on robust sentence representations. It introduces a three-part architecture: a sentence-level Sen-Model that fuses word, character, statistical, and BiomedBERT features through SBA blocks to produce sentence embeddings; an abstract-level Abs-Model (C-RNN with Bi-RNN decoder) that captures sequential context across sentences; and a segment-level Seg-Model (MLP) that operates on fixed-length segments of sentences. Joint inference combines the abstract-level and segment-level predictions, yielding F1 scores competitive with state-of-the-art systems and surpassing the baseline by up to 2.8 percentage points on the benchmark datasets. The approach demonstrates the value of task-focused sentence representations in boosting SSC performance and offers a strong foundation for future enhancements at higher contextual levels.
Abstract
The Sequential Sentence Classification (SSC) task in the domain of medical abstracts involves categorizing sentences into pre-defined headings based on their roles in conveying critical information in the abstract. In the SSC task, sentences are sequentially related to each other. Sentence embeddings are therefore crucial: they must capture both the semantic information among the words in a sentence and the contextual relationships among the sentences in the abstract, which in turn enhances SSC system performance. In this paper, we propose an LSTM-based deep learning network with a focus on creating comprehensive sentence representations at the sentence level. To demonstrate the efficacy of these sentence representations, we also develop a system that uses the resulting sentence embeddings, consisting of a Convolutional-Recurrent Neural Network (C-RNN) at the abstract level and a multi-layer perceptron (MLP) at the segment level. Our proposed system yields highly competitive results compared to state-of-the-art systems and improves the F1 scores of the baseline by 1.0%, 2.8%, and 2.6% on the benchmark datasets PubMed 200K RCT, PubMed 20K RCT, and NICTA-PIBOSO, respectively. This indicates the significant impact of improving sentence representation on boosting model performance.
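As a rough illustration of how abstract-level and segment-level predictions could be combined at inference time, the sketch below blends per-sentence class probabilities from two models and picks the highest-scoring class. The text above does not specify the fusion rule, so the weighted average, the function name `joint_inference`, and the `weight` parameter are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Sequence


def joint_inference(
    abs_probs: Sequence[Sequence[float]],
    seg_probs: Sequence[Sequence[float]],
    weight: float = 0.5,
) -> List[int]:
    """Blend two sets of per-sentence class probabilities and return label indices.

    abs_probs: probabilities from a (hypothetical) abstract-level model,
               one row per sentence, one column per class.
    seg_probs: probabilities from a (hypothetical) segment-level model,
               same shape as abs_probs.
    weight:    share given to abs_probs; (1 - weight) goes to seg_probs.
               This weighted average is an assumed fusion rule.
    """
    if len(abs_probs) != len(seg_probs):
        raise ValueError("both models must score the same number of sentences")

    labels = []
    for a_row, s_row in zip(abs_probs, seg_probs):
        if len(a_row) != len(s_row):
            raise ValueError("both models must score the same number of classes")
        # Weighted average of the two probability vectors for this sentence.
        blended = [weight * a + (1.0 - weight) * s for a, s in zip(a_row, s_row)]
        # Index of the class with the highest blended score.
        labels.append(max(range(len(blended)), key=blended.__getitem__))
    return labels


# Toy example: two sentences, two classes.
abs_probs = [[0.6, 0.4], [0.3, 0.7]]
seg_probs = [[0.2, 0.8], [0.4, 0.6]]
predicted = joint_inference(abs_probs, seg_probs)
```

With equal weights, the segment-level scores override the abstract-level preference for the first sentence in this toy input; setting `weight=1.0` reduces the function to the abstract-level prediction alone.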
