Knowledge AI: Fine-tuning NLP Models for Facilitating Scientific Knowledge Extraction and Understanding

Balaji Muralidharan, Hayden Beadles, Reza Marzban, Kalyan Sashank Mupparaju

TL;DR

Knowledge AI is a domain-specific fine-tuning framework for scientific NLP tasks that aims to democratize access to scientific knowledge. The study adapts LLMs to four core tasks (summarization, text generation, extractive and abstractive question answering, and named entity recognition) and reports significant performance gains over baselines, with clear trade-offs between full fine-tuning and parameter-efficient approaches such as LoRA. The approach uses adaptive tokenization, Longformer extensions for long documents, and domain-specific pre-training (e.g., SciBERT) to improve effectiveness in scientific contexts, drawing on the ArXiv, PubMedQA, SQuAD, CoNLL2003, SciERC, and GENIA datasets. Overall, Knowledge AI highlights the practical viability of fine-tuned LLMs for scientific knowledge extraction and dissemination, offering a foundation for accessible science communication and knowledge discovery, while noting efficiency and scalability considerations for real-world deployment.
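
To make the trade-off between full fine-tuning and LoRA concrete, the sketch below counts trainable weights for a single dense layer under each approach. It is an illustration only: the layer width (768) and rank (8) are assumed values chosen for the example, not settings reported in the paper, and the helper names are hypothetical.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable weights when a dense (d_out x d_in) matrix is updated directly."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a rank-r LoRA update W + B @ A.

    A has shape (rank, d_in) and B has shape (d_out, rank); the original
    matrix W stays frozen, so only A and B are counted.
    """
    return rank * (d_in + d_out)


# Assumed sizes for illustration only (not taken from the paper).
hidden = 768
rank = 8

full = full_finetune_params(hidden, hidden)
lora = lora_params(hidden, hidden, rank)
ratio = lora / full

print(f"full fine-tuning: {full} trainable weights")
print(f"LoRA (rank {rank}): {lora} trainable weights")
print(f"LoRA / full: {ratio:.4f}")
```

Under these assumed sizes, LoRA trains roughly 2% of the weights of the layer it adapts, which is the efficiency side of the trade-off; whether task accuracy holds up is an empirical question that the paper's experiments address.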

Abstract

This project investigates the efficacy of Large Language Models (LLMs) in understanding and extracting scientific knowledge across specific domains, and creates a deep learning framework: Knowledge AI. As part of this framework, we employ pre-trained models and fine-tune them on datasets in the scientific domain. The models are adapted for four key Natural Language Processing (NLP) tasks: summarization, text generation, question answering, and named entity recognition. Our results indicate that domain-specific fine-tuning significantly enhances model performance in each of these tasks, thereby improving their applicability to scientific contexts. This adaptation enables non-experts to efficiently query and extract information within targeted scientific fields, demonstrating the potential of fine-tuned LLMs as a tool for knowledge discovery in the sciences.

Paper Structure (35 sections, 3 figures, 6 tables)

This paper contains 35 sections, 3 figures, and 6 tables.

Figures (3)

  • Figure 1: The left image shows test-set ROUGE and METEOR performance on the PubMedQA dataset. BioGPT appears to score best, but this is misleading. We took a sample of the test set and compared the models' answers in the image on the right. BioGPT and BART repeat the question in their answers; only SciBERT appears to respond to the question, which makes it the strongest model on this sample of 10 questions. See \ref{abstractive_qa_output} for the model questions and answers.
  • Figure 2: Comparison of F1 scores on the NER task for a BERT model fine-tuned on the full and on a reduced SciERC dataset.
  • Figure 3: Example output from the NER task, showing the named entities present in the input text.
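
The caveat in Figure 1, that an overlap metric can reward an answer which merely repeats the question, can be illustrated with a simplified unigram overlap score in the spirit of ROUGE-1. This is a minimal sketch, not the official ROUGE implementation and not the paper's evaluation code. The question, reference, and answers are invented for illustration and are not taken from PubMedQA.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and keep only alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_overlap(candidate: str, reference: str) -> tuple[float, float, float]:
    """Return (precision, recall, F1) of clipped unigram overlap."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # clipped counts of shared tokens
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical example texts (assumptions, not dataset content).
question = "Does aspirin reduce the risk of stroke in older adults?"
reference = "Aspirin did not reduce the risk of stroke in older adults."
echo_answer = question  # a model that simply repeats the question
direct_answer = "No, aspirin did not lower stroke risk."

echo_scores = unigram_overlap(echo_answer, reference)
direct_scores = unigram_overlap(direct_answer, reference)

print("echoed question F1:", round(echo_scores[2], 3))
print("direct answer F1:  ", round(direct_scores[2], 3))
```

In this toy case the echoed question outscores the direct answer because the reference shares most of its words with the question, even though the echo answers nothing. This is one plausible reason why the metric ranking and the manual inspection in Figure 1 can disagree.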