Automating Intervention Discovery from Scientific Literature: A Progressive Ontology Prompting and Dual-LLM Framework

Yuting Hu, Dancheng Liu, Qingyun Wang, Charles Yu, Chenhui Xu, Qingxiao Zheng, Heng Ji, Jinjun Xiong

TL;DR

The paper tackles the problem of automating intervention discovery from large, domain-specific scientific literature. It introduces Progressive Ontology Prompting (POP) to systematically generate context-aware annotation prompts via a prioritized BFS over a predefined intervention ontology, and a dual-agent LLM framework (LLM-Duo) to iteratively refine annotations through explorer-evaluator dynamics and retrieval augmentation. In a speech-language pathology case study, the approach outperforms strong baselines, discovering 2,421 interventions from 64,177 papers and constructing the first public intervention knowledge base in the field with 33,148 nodes and 324,707 relations. The work demonstrates a scalable, ontology-driven pathway for automated knowledge graph construction in healthcare, offering practical benefits for evidence-based practice, QA, and decision support.
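The prioritized BFS behind POP can be pictured with a small sketch. This is not the authors' implementation: the toy ontology, the relation names, the numeric priorities, the prompt wording, and the function name `pop_schedule` are all assumptions made for illustration. It only shows the general idea of walking an ontology level by level, visiting sibling edges in priority order, and emitting one prompt template per edge.

```python
from collections import deque

# Hypothetical toy ontology: node -> list of (relation, child, priority).
# Assumption: a lower priority number means the edge is visited earlier
# among its siblings. The real ontology and priorities are not given here.
ONTOLOGY = {
    "Intervention": [
        ("targets", "Participant", 1),
        ("delivered_in", "Setting", 2),
        ("measured_by", "Outcome", 3),
    ],
    "Participant": [
        ("has", "AgeGroup", 2),
        ("has", "Disorder", 1),
    ],
}


def pop_schedule(ontology, root):
    """Return an ordered list of (head, relation, tail, prompt_template) steps."""
    steps = []
    visited = {root}
    queue = deque([root])
    while queue:
        head = queue.popleft()
        # Sort sibling edges by priority: the "prioritized" part of the BFS.
        for relation, tail, _priority in sorted(
            ontology.get(head, []), key=lambda edge: edge[2]
        ):
            # "{context}" is left as a slot to be filled with earlier annotations.
            prompt = (
                f"Context ({head}): {{context}}. "
                f"From the paper, list every {tail} linked by '{relation}'."
            )
            steps.append((head, relation, tail, prompt))
            if tail not in visited:
                visited.add(tail)
                queue.append(tail)
    return steps


schedule = pop_schedule(ONTOLOGY, "Intervention")
for head, relation, tail, _prompt in schedule:
    print(f"{head} -[{relation}]-> {tail}")
```

In this sketch every edge of the root is scheduled before any edge of its children, so annotations for shallow concepts are available as context when deeper concepts are prompted.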

Abstract

Identifying effective interventions from the scientific literature is challenging due to the high volume of publications, specialized terminology, and inconsistent reporting formats, making manual curation laborious and prone to oversight. To address this challenge, this paper proposes a novel framework leveraging large language models (LLMs), which integrates a progressive ontology prompting (POP) algorithm with a dual-agent system, named LLM-Duo. On the one hand, the POP algorithm conducts a prioritized breadth-first search (BFS) across a predefined ontology, generating structured prompt templates and action sequences to guide the automatic annotation process. On the other hand, the LLM-Duo system features two specialized LLM agents, an explorer and an evaluator, working collaboratively and adversarially to continuously refine annotation quality. We showcase the real-world applicability of our framework through a case study focused on speech-language intervention discovery. Experimental results show that our approach surpasses advanced baselines, achieving more accurate and comprehensive annotations through a fully automated process. Our approach successfully identified 2,421 interventions from a corpus of 64,177 research articles in the speech-language pathology domain, culminating in the creation of a publicly accessible intervention knowledge base with great potential to benefit the speech-language pathology community.
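The explorer-evaluator interplay described above can be sketched as a simple refinement loop. This is a minimal illustration under stated assumptions, not the LLM-Duo system itself: the function name `duo_annotate`, the `max_rounds` limit, the accept-or-feedback protocol, and the stub agents are hypothetical, and the retrieval step is omitted. In practice each agent would wrap an LLM call; here they are plain functions so the loop can run on its own.

```python
from typing import Callable, Optional, Tuple

# Hypothetical agent interfaces (assumptions, not the paper's API):
#   explorer(prompt, feedback)    -> candidate annotation
#   evaluator(prompt, annotation) -> (accepted, feedback)
Explorer = Callable[[str, Optional[str]], str]
Evaluator = Callable[[str, str], Tuple[bool, str]]


def duo_annotate(
    prompt: str,
    explorer: Explorer,
    evaluator: Evaluator,
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Alternate explorer and evaluator until acceptance or max_rounds."""
    feedback: Optional[str] = None
    annotation = ""
    for round_no in range(1, max_rounds + 1):
        annotation = explorer(prompt, feedback)
        accepted, feedback = evaluator(prompt, annotation)
        if accepted:
            return annotation, round_no
    # No agreement within the budget: return the last candidate.
    return annotation, max_rounds


# Stub agents standing in for LLM calls, with made-up answers.
def stub_explorer(prompt: str, feedback: Optional[str]) -> str:
    return "children with aphasia" if feedback is None else "adults with aphasia"


def stub_evaluator(prompt: str, annotation: str) -> Tuple[bool, str]:
    if "adults" in annotation:
        return True, ""
    return False, "The paper describes adult participants; re-check the age group."


result, rounds = duo_annotate(
    "Who are the participants?", stub_explorer, stub_evaluator
)
print(result, rounds)
```

With these stubs the first candidate is rejected, the evaluator's feedback is passed back to the explorer, and the second candidate is accepted.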

Paper Structure (25 sections, 3 equations, 8 figures, 4 tables, 1 algorithm)

Figures (8)

  • Figure 1: Illustration of prompt design and scheduling based on the progressive ontology prompting algorithm.
  • Figure 2: Iterative annotation with two LLM agents under the LLM-Duo framework.
  • Figure 3: Annotation examples of speech-language intervention discovery using the LLM-Duo framework.
  • Figure 4: Ontology of speech-language intervention.
  • Figure 5: Evaluation of 'participant' annotation with POP of different context sizes.
  • ...and 3 more figures