NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning

Nathaniel Weir, Peter Clark, Benjamin Van Durme

TL;DR

NELLIE answers questions by constructing end-to-end entailment-tree proofs whose leaves are grounded in an external NL fact corpus, targeting both the interpretability and the hallucination problems of large language models. It combines a Prolog-style backward-chaining prover with neural predicates, retrieval-guided rule generation, and template-conditioned decomposition. Evaluated on WorldTree and EntailmentBank, it matches or surpasses similar-sized baselines, and it generalizes to OpenBookQA when common knowledge is incorporated. The results indicate that neural reasoning and symbolic inference can be combined to produce human-interpretable explanations grounded in textual knowledge.

Abstract

Our goal is a modern approach to answering questions via systematic reasoning, where answers are supported by human-interpretable proof trees grounded in an NL corpus of authoritative facts. Such a system would help alleviate the challenges of interpretability and hallucination with modern LMs, and the lack of grounding of current explanation methods (e.g., Chain-of-Thought). This paper proposes a new take on Prolog-based inference engines, where we replace handcrafted rules with a combination of neural language modeling, guided generation, and semiparametric dense retrieval. Our implementation, NELLIE, is the first system to demonstrate fully interpretable, end-to-end grounded QA as entailment tree proof search, going beyond earlier work explaining known-to-be-true facts from text. In experiments, NELLIE outperforms a similar-sized state-of-the-art reasoner [Tafjord et al., 2022] while producing knowledge-grounded explanations. We also find NELLIE can exploit both semi-structured and NL text corpora to guide reasoning. Together, these results suggest a new way to jointly reap the benefits of both modern neural methods and traditional symbolic reasoning.

Paper Structure (37 sections, 2 equations, 11 figures, 1 table)

This paper contains 37 sections, 2 equations, 11 figures, 1 table.

Figures (11)

  • Figure 1: Given a query, Nellie performs a neuro-symbolic backward chaining search for proof trees whose leaves are grounded in a corpus of facts. It generates candidate decomposition rules guided by retrieved facts or templates. Then, it recursively tries to prove rule conditions via entailment from the corpus or further decomposition.
  • Figure 2: Comparison of approaches to neural explainable QA (XQA). Each approach leads to NL graphs in support of a query. Our proposal is to produce logically directed explanations containing model-generated intermediate steps while grounding a tree in verified facts without relying on handwritten Horn clauses.
  • Figure 3: Proposed system framework. An off-the-shelf theorem prover searches for proofs of query $\textsc{Prove}(h)$, where symbol $h$ is an NL hypothesis translated from a QA pair. The prover uses a set of meta-axioms invoking neural retrieval, entailment, and generation predicates to dynamically instantiate inference rules that use the NL factbase.
  • Figure 4: Example question in which a sub-hypothesis in the proof tree is grounded to the question context rather than to the factbase.
  • Figure 5: Sample of WorldTree templates used for guided generation.
  • ...and 6 more figures
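
The backward-chaining search described in the Figure 1 caption can be illustrated with a minimal Python sketch. This is not the authors' implementation: the factbase, the example hypothesis, and the names `FACTS`, `CANDIDATE_RULES`, `entailing_fact`, `generate_decompositions`, `prove`, and `leaves` are all hypothetical. In particular, exact string matching stands in for the neural entailment predicate, and a hard-coded lookup table stands in for retrieval- and template-guided rule generation by a language model.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Toy NL factbase (hypothetical stand-in for the corpus of authoritative facts).
FACTS = [
    "a fork is made of metal",
    "metal is a thermal conductor",
]

# Hypothetical stand-in for the neural rule generator: maps a hypothesis to
# candidate two-premise decompositions. Some candidates are deliberately
# unprovable, as generated candidates may be.
CANDIDATE_RULES = {
    "a fork is a thermal conductor": [
        ("a fork is made of plastic", "plastic is a thermal conductor"),
        ("a fork is made of metal", "metal is a thermal conductor"),
    ],
}


@dataclass
class Proof:
    hypothesis: str
    grounded_in: Optional[str] = None            # set for leaves
    premises: List["Proof"] = field(default_factory=list)  # set for inner nodes


def normalize(text: str) -> str:
    return " ".join(text.lower().split())


def entailing_fact(hypothesis: str) -> Optional[str]:
    """Stub entailment check: exact match after normalization (assumption)."""
    for fact in FACTS:
        if normalize(fact) == normalize(hypothesis):
            return fact
    return None


def generate_decompositions(hypothesis: str) -> List[Tuple[str, ...]]:
    """Stub rule generator: table lookup instead of a language model."""
    return CANDIDATE_RULES.get(normalize(hypothesis), [])


def prove(hypothesis: str, max_depth: int = 3) -> Optional[Proof]:
    """Backward chaining: ground the hypothesis in a fact, or decompose it."""
    fact = entailing_fact(hypothesis)
    if fact is not None:
        return Proof(hypothesis, grounded_in=fact)
    if max_depth == 0:
        return None
    for premises in generate_decompositions(hypothesis):
        subproofs = [prove(p, max_depth - 1) for p in premises]
        if all(sp is not None for sp in subproofs):
            return Proof(hypothesis, premises=subproofs)
    return None


def leaves(proof: Proof) -> List[str]:
    """Collect the corpus facts that ground the proof tree, left to right."""
    if proof.grounded_in is not None:
        return [proof.grounded_in]
    return [leaf for sub in proof.premises for leaf in leaves(sub)]


if __name__ == "__main__":
    tree = prove("a fork is a thermal conductor")
    print(leaves(tree) if tree else "no proof found")
```

The depth bound keeps the recursion finite when the generator keeps proposing new sub-hypotheses. A proof is returned only when every leaf is grounded in the factbase, which is the property that makes the resulting tree an explanation rather than an unverified chain of generated text.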