Multi-label Sequential Sentence Classification via Large Language Model

Mengfei Lan, Lecheng Zheng, Shufan Ming, Halil Kilicoglu

TL;DR

The paper tackles sequential sentence classification (SSC) in scientific texts, particularly in the multi-label setting, by introducing LLM-SSC, a framework that applies large language models in two regimes: in-context learning with demonstration-based prompts, and parameter-efficient fine-tuning with LoRA. It adds a space-thinking mechanism and an auto-weighting multi-label contrastive loss (WeighCon) to better handle multi-label predictions and label relationships. The authors also release BIORC800, a manually annotated multi-label SSC dataset consisting mainly of unstructured biomedical abstracts. Experiments show strong performance in both the in-context and fine-tuning regimes, and ablations support the contribution of each component. The work offers practical guidance on prompt design, tuning strategies, and dataset creation for SSC in scientific domains, with potential benefits for information retrieval and summarization.

Abstract

Sequential sentence classification (SSC) in scientific publications is crucial for supporting downstream tasks such as fine-grained information retrieval and extractive summarization. However, current SSC methods are constrained by model size, sequence length, and a single-label setting. To address these limitations, this paper proposes LLM-SSC, a large language model (LLM)-based framework for both single- and multi-label SSC tasks. Unlike previous approaches that employ small- or medium-sized language models, the proposed framework uses LLMs to generate SSC labels through designed prompts, which enhance task understanding by incorporating demonstrations and a query that describes the prediction target. We also present a multi-label contrastive learning loss with an auto-weighting scheme, enabling the multi-label classification task. To support our multi-label SSC analysis, we introduce and release a new dataset, biorc800, which mainly contains unstructured abstracts in the biomedical domain with manual annotations. Experiments demonstrate LLM-SSC's strong performance in SSC under both in-context learning and task-specific tuning settings. We release biorc800 and our code at: https://github.com/ScienceNLP-Lab/LLM-SSC.
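
The abstract describes prompts that combine demonstrations with a query naming the prediction target. A minimal sketch of that idea is given below, under stated assumptions: the label set, the prompt wording, and the helper names `build_prompt` and `parse_labels` are all hypothetical and are not taken from the paper or its repository.

```python
from typing import List, Sequence, Tuple

# Hypothetical label set for illustration only; the paper's label inventory may differ.
LABELS = ["BACKGROUND", "OBJECTIVE", "METHODS", "RESULTS", "CONCLUSIONS"]

# One demonstration: (abstract text, target sentence, gold labels).
Demo = Tuple[str, str, List[str]]


def build_prompt(
    demos: Sequence[Demo],
    abstract: str,
    target: str,
    labels: Sequence[str] = LABELS,
) -> str:
    """Assemble a k-shot prompt: instruction, demonstrations, then the query."""
    parts = [
        "Classify the target sentence of a scientific abstract. "
        "Choose one or more labels from: " + ", ".join(labels) + "."
    ]
    for demo_abstract, demo_target, demo_labels in demos:
        parts.append(
            f"Abstract: {demo_abstract}\n"
            f"Sentence: {demo_target}\n"
            f"Labels: {', '.join(demo_labels)}"
        )
    # The query mirrors the demonstration layout but leaves the labels open.
    parts.append(f"Abstract: {abstract}\nSentence: {target}\nLabels:")
    return "\n\n".join(parts)


def parse_labels(generated: str, labels: Sequence[str] = LABELS) -> List[str]:
    """Map the first line of model output to known labels, dropping anything else."""
    stripped = generated.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    candidates = [c.strip().upper() for c in first_line.split(",")]
    return [c for c in candidates if c in labels]
```

With one demonstration this produces a 1-shot prompt; passing an empty `demos` list gives a zero-shot prompt. Allowing a comma-separated answer is one simple way to accommodate the multi-label case, and filtering against the known label set guards against out-of-vocabulary generations.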

Paper Structure

This paper contains 39 sections, 8 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Structure of our LLM-based in-context learning and finetuning for SSC.
  • Figure 2: biorc800 1-shot Prompt
  • Figure 4: PubMed 20K RCT 1-shot Prompt