Exploring Prompt-Based Methods for Zero-Shot Hypernym Prediction with Large Language Models

Mikhail Tikhomirov, Natalia Loukachevitch

TL;DR

This work studies zero-shot hypernym prediction using prompt-based LLMs, estimating hypernym probabilities from carefully crafted prompts and comparing full-sequence versus selective-token scores. It demonstrates that prompt effectiveness correlates with classic pattern-based signals and shows that co-hyponym prompts and co-hyponym augmentation can boost zero-shot hypernym predictions. Additionally, an iterative ranking approach traversing hypernym chains further improves taxonomic predictions, achieving a MAP of 0.8 on the BLESS dataset. The findings offer practical guidance for prompt design, model selection, and hierarchical taxonomy induction in zero-shot settings, with potential impact on semantic reasoning and knowledge organization tasks.
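The difference between full-sequence and selective-token scoring can be illustrated with a small sketch. It assumes one plausible reading of the two terms: the full score sums the log-probabilities of every token in the filled prompt, while the selective score sums only those of the candidate hypernym tokens. The prompt template, the `toy_scorer` function, and all other names below are hypothetical stand-ins rather than the authors' implementation; a real setup would obtain per-token log-probabilities from a language model.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a language model: returns one log-probability
# per token, each conditioned on the tokens before it.
TokenScorer = Callable[[Sequence[str]], List[float]]


def toy_scorer(tokens: Sequence[str]) -> List[float]:
    # Toy values, not real model output: longer tokens count as less likely.
    return [-0.1 * len(token) for token in tokens]


def full_score(tokens: Sequence[str], scorer: TokenScorer) -> float:
    # Assumed "full" variant: log-probability of the whole prompt.
    return sum(scorer(tokens))


def selective_score(
    tokens: Sequence[str], target_span: Tuple[int, int], scorer: TokenScorer
) -> float:
    # Assumed "selective" variant: log-probability of the target tokens only.
    start, end = target_span
    return sum(scorer(tokens)[start:end])


def rank_candidates(
    hyponym: str,
    candidates: Sequence[str],
    scorer: TokenScorer,
    selective: bool = True,
) -> List[Tuple[str, float]]:
    scored = []
    for candidate in candidates:
        # Hypothetical pattern-style prompt; single-token candidates only.
        tokens = [hyponym, "is", "a", "kind", "of", candidate]
        span = (len(tokens) - 1, len(tokens))
        if selective:
            score = selective_score(tokens, span, scorer)
        else:
            score = full_score(tokens, scorer)
        scored.append((candidate, score))
    # Higher log-probability means a better-ranked candidate.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_candidates("cat", ["animal", "vehicle", "pet"], toy_scorer)
```

With a real model the two variants can rank candidates differently, because the full score also reflects how likely the hyponym and the pattern words are, whereas the selective score isolates the candidate.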

Abstract

This article investigates a zero-shot approach to hypernymy prediction using large language models (LLMs). The study employs a method based on text probability calculation, applying it to various generated prompts. The experiments demonstrate a strong correlation between the effectiveness of language model prompts and classic patterns, indicating that preliminary prompt selection can be carried out using smaller models before moving to larger ones. We also explore prompts for predicting co-hyponyms and improving hypernymy predictions by augmenting prompts with additional information through automatically identified co-hyponyms. An iterative approach is developed for predicting higher-level concepts, which further improves the quality on the BLESS dataset (MAP = 0.8).
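For readers unfamiliar with the reported metric, mean average precision (MAP) over ranked candidate lists can be computed as in the sketch below. This is a common textbook formulation that normalizes by the number of gold hypernyms; the exact variant used in the paper's evaluation is not stated here, and the example words and function names are illustrative assumptions.

```python
from typing import List, Sequence, Set


def average_precision(ranked: Sequence[str], gold: Set[str]) -> float:
    # Sum precision at each rank where a gold hypernym appears,
    # then normalize by the number of gold hypernyms (assumed variant).
    if not gold:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in gold:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(gold)


def mean_average_precision(
    rankings: Sequence[Sequence[str]], gold_sets: Sequence[Set[str]]
) -> float:
    # Average the per-query average precision over all query words.
    if not rankings:
        return 0.0
    scores: List[float] = [
        average_precision(ranked, gold)
        for ranked, gold in zip(rankings, gold_sets)
    ]
    return sum(scores) / len(scores)


# Illustrative toy data, not taken from BLESS.
rankings = [["animal", "vehicle", "mammal"], ["tool", "fruit"]]
gold_sets = [{"animal", "mammal"}, {"fruit"}]
map_score = mean_average_precision(rankings, gold_sets)
```

In this toy case the first query scores (1/1 + 2/3) / 2 and the second scores 1/2, so the mean is 2/3.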

Paper Structure

This paper contains 22 sections, 4 equations, 2 figures, 12 tables.

Figures (2)

  • Figure 1: Probability calculation scheme for full and selective variants
  • Figure 2: Example of iterative approach