Information Extraction from Conversation Transcripts: Neuro-Symbolic vs. LLM
Alice Saebom Kwak, Maria Alexeeva, Gus Hahn-Powell, Keith Alcock, Kevin McLaughlin, Doug McCorkle, Gabe McNunn, Mihai Surdeanu
TL;DR
This study directly compares a neuro-symbolic (NS) and an LLM-based information extraction (IE) system for extracting structured data from agricultural dialogue transcripts. The two systems share automatic speech recognition (ASR) and preprocessing but diverge in extraction and ontology grounding: the NS system relies on rule-based extraction and symbolic reasoning, while the LLM-based system uses in-context learning followed by verification and grounding. Across nine lengthy interviews in the pork, dairy, and crop domains, the LLM-based system achieves higher recall and F1 than the NS system (e.g., total F1: 69.4 vs. 52.7), at the cost of slower runtime on CPU and dependence on the underlying model. The NS system offers fast, controllable extraction and performs well on context-free tasks, at the expense of generalization and maintenance effort. The results highlight a practical trade-off for real-world NLP deployment, namely maximizing performance versus ensuring efficiency, interpretability, and controllability, and they underscore the importance of weighing hidden costs when choosing IE architectures for industry use.
Abstract
The current trend in information extraction (IE) is to rely extensively on large language models (LLMs), effectively discarding decades of experience in building symbolic or statistical IE systems. This paper compares a neuro-symbolic (NS) and an LLM-based IE system in the agricultural domain, evaluating them on nine interviews across pork, dairy, and crop subdomains. The LLM-based system outperforms the NS one (F1 total: 69.4 vs. 52.7; core: 63.0 vs. 47.2), where total includes all extracted information and core focuses on essential details. However, each system has trade-offs: the NS approach offers faster runtime, greater control, and high accuracy in context-free tasks but lacks generalizability, struggles with contextual nuances, and requires significant resources to develop and maintain. The LLM-based system offers higher performance, faster deployment, and easier maintenance but has slower runtime, limited control, model dependency, and hallucination risks. Our findings highlight the "hidden cost" of deploying NLP systems in real-world applications, emphasizing the need to balance performance, efficiency, and control.
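For readers less familiar with the metric, the F1 scores reported above combine precision and recall as their harmonic mean. The sketch below shows the standard definition only; the function name and the counts in the usage note are hypothetical and are not taken from the paper, and it makes no assumption about how the authors match extractions to gold annotations or aggregate scores across interviews.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard precision, recall, and F1 from raw counts.

    tp: extractions that match a gold annotation
    fp: extractions with no matching gold annotation
    fn: gold annotations the system missed
    (How a "match" is decided is an assumption left to the caller.)
    """
    # Guard against division by zero when a system extracts nothing
    # or when there are no gold annotations.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

As an illustration with made-up counts, `precision_recall_f1(6, 2, 4)` gives a precision of 0.75 and a recall of 0.6. Because the harmonic mean is pulled toward the lower of the two values, a system with high precision but low recall can still end up with a modest F1.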
