Information Extraction from Conversation Transcripts: Neuro-Symbolic vs. LLM

Alice Saebom Kwak, Maria Alexeeva, Gus Hahn-Powell, Keith Alcock, Kevin McLaughlin, Doug McCorkle, Gabe McNunn, Mihai Surdeanu

TL;DR

This study directly compares a neuro-symbolic (NS) and an LLM-based information extraction system for mining structured data from agricultural dialogue transcripts. Both systems share ASR and preprocessing but diverge in extraction and ontology grounding: the NS system relies on rule-based extraction and symbolic reasoning, while the LLM-based system uses in-context learning followed by verification and grounding. Across nine lengthy interviews in the pork, dairy, and crop domains, the LLM-based system achieves higher recall and F1 than the NS system (e.g., total F1: 69.4 vs. 52.7), at the cost of slower runtime on CPU and model-dependency concerns. The NS system offers fast, controllable extraction that is accurate on context-free tasks, at the expense of generalization and maintenance effort. The results highlight a practical trade-off for real-world NLP deployment between maximizing performance and ensuring efficiency, interpretability, and controllability, and they underscore the importance of considering hidden costs when choosing IE architectures for industry use.

Abstract

The current trend in information extraction (IE) is to rely extensively on large language models, effectively discarding decades of experience in building symbolic or statistical IE systems. This paper compares a neuro-symbolic (NS) and an LLM-based IE system in the agricultural domain, evaluating them on nine interviews across pork, dairy, and crop subdomains. The LLM-based system outperforms the NS one (F1 total: 69.4 vs. 52.7; core: 63.0 vs. 47.2), where total includes all extracted information and core focuses on essential details. However, each system has trade-offs: the NS approach offers faster runtime, greater control, and high accuracy in context-free tasks, but it lacks generalizability, struggles with contextual nuances, and requires significant resources to develop and maintain. The LLM-based system achieves higher performance, faster deployment, and easier maintenance, but it has slower runtime, limited control, model dependency, and hallucination risks. Our findings highlight the "hidden cost" of deploying NLP systems in real-world applications, emphasizing the need to balance performance, efficiency, and control.

Paper Structure

This paper contains 53 sections, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Sample conversation transcript simplified for readability, presented as produced by the ASR system before postprocessing. The target identifier to be extracted is in italics. The target values for the identifier are in bold. The ASR errors are in red: the first "4" is spurious, "faring" must be corrected to "farrowing", and "Souths" to "sows".
  • Figure 2: Generalized pipeline of the two system versions. Both share ASR and preprocessing but diverge afterward. The NS system uses a rule-based approach to extract and assemble identifier-value pairs, with dialogue management linking distant fragments. Grounding relies on embedding similarity and string matching. The LLM-based system segments interviews by topic, extracts information via in-context learning, and verifies results through field validation and hallucination filtering before mapping fields to relevant indicators.
  • Figure 3: A simplified sample rule for extracting compound entities that contain common identifier key words (triggers).
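
The Figure 2 caption notes that NS grounding relies on embedding similarity and string matching. As a rough illustration of the string-matching half only, the sketch below maps an extracted identifier to the closest label in a small indicator list using Python's standard `difflib`. The indicator labels, the `ground` function, and the 0.6 threshold are assumptions made for this example and are not taken from the paper; the actual system may work quite differently.

```python
from difflib import SequenceMatcher

# Hypothetical indicator labels; the paper's actual ontology is not shown here.
INDICATORS = ["farrowing rate", "sow inventory", "weaning age", "milk yield"]


def normalize(text):
    """Lowercase and collapse whitespace so surface variation matters less."""
    return " ".join(text.lower().split())


def ground(identifier, indicators=INDICATORS, threshold=0.6):
    """Return (best_label, score); best_label is None if the score is below threshold.

    The threshold value is an illustrative assumption, not a reported setting.
    """
    query = normalize(identifier)
    best_label, best_score = None, 0.0
    for label in indicators:
        # ratio() is a character-level similarity in [0, 1].
        score = SequenceMatcher(None, query, normalize(label)).ratio()
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


if __name__ == "__main__":
    print(ground("Farrowing  Rate"))
    print(ground("farrowing rates"))
    print(ground("xyz qqq"))
```

A pure string matcher like this is fast and easy to inspect, but it cannot link paraphrases that share few characters, which is presumably why the caption pairs string matching with embedding similarity.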