Knowledge Bases in Support of Large Language Models for Processing Web News

Yihe Zhang, Nabin Pakka, Nian-Feng Tzeng

TL;DR

A general framework for building knowledge bases with the aid of LLMs, tailored for processing Web news and evaluated on different news-related datasets for news category classification, with promising experimental results.

Abstract

Large Language Models (LLMs) have recently attracted considerable interest across a wide range of applications. During pre-training on massive datasets, such a model implicitly memorizes factual knowledge from its training data in its hidden parameters. However, knowledge held implicitly in parameters is often used ineffectively by downstream applications because of the lack of common-sense reasoning. In this article, we introduce a general framework for building knowledge bases with the aid of LLMs, tailored for processing Web news. The framework applies a rule-based News Information Extractor (NewsIE) to news items to extract their relational tuples, referred to as knowledge bases, which are then graph-convolved with the implicit knowledge facts of the news items obtained from LLMs, for classification. It involves two lightweight components: 1) NewsIE, which extracts the structural information of every news item in the form of relational tuples; and 2) BERTGraph, which graph-convolves the implicit knowledge facts with the relational tuples extracted by NewsIE. We have evaluated our framework on different news-related datasets for news category classification, with promising experimental results.
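
To make the idea of combining relational tuples with per-entity features concrete, the following is a minimal sketch and not the authors' BERTGraph implementation. It assumes tuples of the form (subject, relation, object), treats each subject and object as a graph node, ignores the relation label, and applies one standard symmetric-normalized graph-convolution step with self-loops and no learned weights. The example tuples, the function names `build_graph` and `gcn_layer`, and the one-hot features (stand-ins for LLM-derived embeddings) are all hypothetical.

```python
from math import sqrt


def build_graph(tuples):
    """Map each distinct subject/object to a node index and collect undirected edges."""
    index = {}
    edges = set()
    for subj, _rel, obj in tuples:  # the relation label is ignored in this sketch
        for entity in (subj, obj):
            index.setdefault(entity, len(index))
        edges.add((index[subj], index[obj]))
        edges.add((index[obj], index[subj]))
    return index, edges


def gcn_layer(features, edges):
    """One propagation step H' = D^{-1/2} (A + I) D^{-1/2} H, without weights or nonlinearity."""
    n = len(features)
    dim = len(features[0])
    # Adjacency matrix with self-loops added on the diagonal.
    adj = [[1.0 if (i == j or (i, j) in edges) else 0.0 for j in range(n)]
           for i in range(n)]
    deg = [sum(row) for row in adj]
    out = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if adj[i][j]:
                weight = 1.0 / sqrt(deg[i] * deg[j])
                for k in range(dim):
                    out[i][k] += weight * features[j][k]
    return out


# Hypothetical tuples, as a rule-based extractor might produce for one news item.
tuples = [("NASA", "launched", "rocket"), ("rocket", "reached", "orbit")]
index, edges = build_graph(tuples)

# Placeholder node features; in practice these would be LLM-derived embeddings.
features = [[1.0 if i == j else 0.0 for j in range(len(index))]
            for i in range(len(index))]
hidden = gcn_layer(features, edges)
```

After one step, each node's vector mixes its own features with those of its direct neighbors only, so "NASA" picks up information from "rocket" but not yet from "orbit"; stacking further steps would propagate information across longer paths.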

Paper Structure

This paper contains 24 sections, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Illustration of Knowledge Base (left) and Large Language Model (right), with the former typically storing structured knowledge explicitly and the latter holding unstructured knowledge implicitly.
  • Figure 2: Overview of BERTGraph framework.
  • Figure 3: The design of Text-to-Graph Adapter.
  • Figure 4: The sequence of clause type identification.
  • Figure 5: F1 score, accuracy, and precision under various amounts of training data (in % of each dataset).