FS-RAG: A Frame Semantics Based Approach for Improved Factual Accuracy in Large Language Models
Harish Tayyar Madabushi
TL;DR
The paper addresses factual hallucinations in large language models by proposing FS-RAG, a retrieval system driven by frame semantics: facts are indexed by the frames they evoke, and relations between frames are used to expand retrieval. Built on frame semantics and bootstrapped frame generation, FS-RAG aims to improve retrieval recall on a closed-domain QA task, evaluated on EntailmentBank, while offering interpretable, data-driven insights. The key contributions are a frame-based retrieval pipeline, qualitative and empirical demonstrations of improved retrieval over keyword and LLM-based baselines, and interpretable outputs that aid debugging and theory refinement. In practice, the approach could make LLMs more trustworthy in specialized domains by reducing factual hallucinations and enabling frame-aware reasoning.
Abstract
We present a novel extension to Retrieval Augmented Generation with the goal of mitigating factual inaccuracies in the output of large language models. Specifically, our method draws on the cognitive linguistic theory of frame semantics to index and retrieve the factual information that large language models need to answer queries. We conduct experiments to demonstrate the effectiveness of this method in terms of both retrieval performance and the relevance of the automatically generated frames and frame relations. Our results show that this novel mechanism of Frame Semantic-based retrieval, designed to improve Retrieval Augmented Generation (FS-RAG), is effective and offers potential for providing data-driven insights into frame semantics theory. We provide open access to our program code and prompts.
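To make the idea of indexing facts by frame and expanding retrieval through frame relations concrete, the following is a minimal Python sketch. It is not the released FS-RAG code. The facts, frame names, frame relations, and the function names build_index, expand_frames, and retrieve are hypothetical and chosen only for illustration. In the method described above, frames and frame relations are generated automatically rather than written by hand.

```python
from collections import defaultdict

# Hypothetical toy data: each fact mapped to the frames it evokes.
# These labels are hand-written here purely for illustration.
FACT_FRAMES = {
    "the sun is a kind of star": {"Categorization"},
    "a star produces light and heat": {"Emitting"},
    "plants use sunlight to make food": {"Using_resource"},
}

# Hypothetical frame-to-frame relations used to widen retrieval.
FRAME_RELATIONS = {
    "Categorization": {"Emitting"},
}


def build_index(fact_frames):
    """Invert the fact -> frames mapping into a frame -> facts index."""
    index = defaultdict(set)
    for fact, frames in fact_frames.items():
        for frame in frames:
            index[frame].add(fact)
    return index


def expand_frames(frames, relations, hops=1):
    """Add frames reachable within `hops` relation steps of the query frames."""
    expanded = set(frames)
    frontier = set(frames)
    for _ in range(hops):
        frontier = {r for f in frontier for r in relations.get(f, ())} - expanded
        expanded |= frontier
    return expanded


def retrieve(query_frames, index, relations, hops=1):
    """Return, in sorted order, all facts indexed under the expanded frame set."""
    facts = set()
    for frame in expand_frames(query_frames, relations, hops):
        facts |= index.get(frame, set())
    return sorted(facts)
```

Calling retrieve({"Categorization"}, build_index(FACT_FRAMES), FRAME_RELATIONS) returns both the fact indexed under the query frame and the fact indexed under the related frame, while setting hops=0 returns only the former. This is the sense in which frame relations can raise recall over matching on the query frames alone.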
