A Language Model based Framework for New Concept Placement in Ontologies

Hang Dong; Jiaoyan Chen; Yuan He; Yongsheng Gao; Ian Horrocks

TL;DR

This study shows the advantages of PLMs for placing new concepts into ontologies, highlights the encouraging performance of LLMs, which motivates future studies, and proposes explainable instruction tuning of LLMs for improved performance.

Abstract

We investigate the task of inserting new concepts extracted from texts into an ontology using language models. We explore a three-step approach: edge search, which finds a set of candidate locations for insertion (i.e., subsumptions between concepts); edge formation and enrichment, which leverages the ontological structure to produce and enhance the edge candidates; and edge selection, which finally determines the edge at which the new concept is placed. In all steps, we propose to leverage neural methods: for edge search, we apply embedding-based methods and contrastive learning with Pre-trained Language Models (PLMs) such as BERT; for edge selection, we adapt a BERT fine-tuning-based multi-label Edge-Cross-encoder and Large Language Models (LLMs) such as the GPT series, FLAN-T5, and Llama 2. We evaluate the methods on recent datasets created using the SNOMED CT ontology and the MedMentions entity linking benchmark. The best settings in our framework use a fine-tuned PLM for search and a multi-label Cross-encoder for selection. Zero-shot prompting of LLMs is still not adequate for the task, and we propose explainable instruction tuning of LLMs for improved performance. Our study shows the advantages of PLMs and highlights the encouraging performance of LLMs, which motivates future studies.
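To make the search and selection steps concrete, the following is a minimal sketch rather than the authors' implementation. It assumes hand-made 3-dimensional vectors in place of PLM embeddings, and the names `TOY_EMBEDDINGS`, `search_seed_concepts`, and `select_edge` are hypothetical. The selection score used here (mean cosine similarity) is only a stand-in for the fine-tuned Edge-Cross-encoder or an LLM described in the abstract.

```python
import math

# Hand-made vectors standing in for PLM embeddings (illustrative values only,
# not produced by any real model).
TOY_EMBEDDINGS = {
    "Infection": [0.9, 0.1, 0.0],
    "Viral infection": [0.8, 0.5, 0.0],
    "Fracture": [0.0, 0.1, 0.9],
    "novel viral pneumonia": [0.7, 0.6, 0.1],  # the new concept mention
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def search_seed_concepts(mention, concepts, k):
    """Edge search stand-in: rank ontology concepts by similarity to the mention."""
    query = TOY_EMBEDDINGS[mention]
    ranked = sorted(
        concepts,
        key=lambda c: cosine(query, TOY_EMBEDDINGS[c]),
        reverse=True,
    )
    return ranked[:k]


def select_edge(mention, edges):
    """Edge selection stand-in: pick the (parent, child) edge with the best score.

    A child of None marks placement as a new leaf (an assumption of this sketch).
    The score is the mean similarity between the mention and the edge's concepts.
    """
    query = TOY_EMBEDDINGS[mention]

    def score(edge):
        concepts = [c for c in edge if c is not None]
        return sum(cosine(query, TOY_EMBEDDINGS[c]) for c in concepts) / len(concepts)

    return max(edges, key=score)


seeds = search_seed_concepts(
    "novel viral pneumonia", ["Infection", "Viral infection", "Fracture"], k=2
)
candidate_edges = [("Infection", "Viral infection"), ("Viral infection", None)]
best = select_edge("novel viral pneumonia", candidate_edges)
print(seeds, best)
```

In this toy setting the mention is closest to "Viral infection", so it is ranked first and the leaf edge under it is selected. A real system would replace both the lookup table and the scoring function with learned models.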

Paper Structure (30 sections, 4 equations, 3 figures, 5 tables)

Introduction
Related Work
Ontology Concept Placement
Pre-trained Language Models for Ontology Concept Placement
Problem Statement
Methodology
Edge Search: Searching Seed Concepts or Edges
Concept Search with Fixed Embeddings
Edge Search with Fine-tuning Edge-Bi-encoder
Edge Formation and Enrichment
Edge Formation from Seed Concepts
Edge Enrichment from Seed Edges
Edge Selection
Fine-tuning PLMs: Multi-label Edge-Cross-encoder
Zero-shot Prompting LLMs
...and 15 more sections

Figures (3)

Figure 1: An overall three-step framework for ontology concept placement with LMs.
Figure 2: An example of the edge formation and enrichment process using ontology structure. Edge formation transforms a seed concept into a set of edges, while edge enrichment augments the set of edges one by one. For methods that directly search edges (e.g., Edge-Bi-encoder), no edge formation is needed and only enrichment is applied.
Figure 3: Ablation results on top-50 edge candidates with MM-S14-Disease dataset: (a) Top-left: results with Edge-Bi-encoder, with or without the edge enrichment step, on validation set; (b) Top-right: overall results with Edge-Cross-encoder, with or without contexts; (c) Bottom: overall results after Edge Selection with Llama-2-7B, with explainable instruction-tuning, normal instruction-tuning, or without instruction-tuning.
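The edge formation and enrichment process described in the caption of Figure 2 can be sketched on a toy hierarchy. This is one plausible reading under stated assumptions, not the paper's exact procedure: formation is assumed to pair a seed concept with its direct parents and children plus a leaf placeholder, and enrichment is assumed to add edges one hop up from the parent and one hop down from the child. The concept names and the functions `form_edges` and `enrich_edges` are hypothetical.

```python
from collections import defaultdict

# Toy is-a hierarchy as (parent, child) pairs; names are illustrative only.
SUBSUMPTIONS = [
    ("Disease", "Infection"),
    ("Infection", "Viral infection"),
    ("Infection", "Bacterial infection"),
    ("Viral infection", "Influenza"),
]

children = defaultdict(set)
parents = defaultdict(set)
for parent, child in SUBSUMPTIONS:
    children[parent].add(child)
    parents[child].add(parent)

LEAF = None  # placeholder child meaning "insert as a new leaf" (assumption)


def form_edges(seed):
    """Edge formation: turn one seed concept into a set of candidate edges."""
    edges = {(seed, c) for c in children[seed]}
    edges |= {(p, seed) for p in parents[seed]}
    edges.add((seed, LEAF))
    return edges


def enrich_edges(edges):
    """Edge enrichment: augment each edge with one-hop neighbours."""
    enriched = set(edges)
    for parent, child in edges:
        # One hop up: pair each parent of the edge's parent with the same child.
        for grandparent in parents[parent]:
            enriched.add((grandparent, child))
        # One hop down: pair the same parent with each child of the edge's child.
        if child is not LEAF:
            for grandchild in children[child]:
                enriched.add((parent, grandchild))
    return enriched


formed = form_edges("Infection")
enriched = enrich_edges({("Infection", "Viral infection")})
print(sorted(formed, key=str))
print(sorted(enriched, key=str))
```

For a method that searches edges directly, only `enrich_edges` would apply, which matches the caption's note that no edge formation is needed in that case.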
