ZEBRA: Zero-Shot Example-Based Retrieval Augmentation for Commonsense Question Answering

Francesco Maria Molfese, Simone Conia, Riccardo Orlando, Roberto Navigli

TL;DR

ZEBRA is a zero-shot question answering framework that combines retrieval, case-based reasoning and introspection without requiring any additional training of the LLM. It consistently outperforms strong LLMs and previous knowledge integration approaches.

Abstract

Current Large Language Models (LLMs) have shown strong reasoning capabilities in commonsense question answering benchmarks, but the process underlying their success remains largely opaque. As a consequence, recent approaches have equipped LLMs with mechanisms for knowledge retrieval, reasoning and introspection, not only to improve their capabilities but also to enhance the interpretability of their outputs. However, these methods require additional training, hand-crafted templates or human-written explanations. To address these issues, we introduce ZEBRA, a zero-shot question answering framework that combines retrieval, case-based reasoning and introspection and dispenses with the need for additional training of the LLM. Given an input question, ZEBRA retrieves relevant question-knowledge pairs from a knowledge base and generates new knowledge by reasoning over the relationships in these pairs. This generated knowledge is then used to answer the input question, improving the model's performance and interpretability. We evaluate our approach across 8 well-established commonsense reasoning benchmarks, demonstrating that ZEBRA consistently outperforms strong LLMs and previous knowledge integration approaches, achieving an average accuracy improvement of up to 4.5 points.

Paper Structure (29 sections, 6 equations, 3 figures, 11 tables)

This paper contains 29 sections, 6 equations, 3 figures, 11 tables.

Figures (3)

  • Figure 1: Performance benefits of using ZEBRA against standard retrieval augmentation methods for commonsense reasoning across four Large Language Models.
  • Figure 2: The ZEBRA framework in its entirety. Starting with a question $Q$ and its possible choices $C$, the first step (example retrieval) asks the retriever to fetch relevant examples from a collection of questions paired with their choices and associated knowledge explanations $(Q_e, C_e, X_e)$. In the second step (knowledge generation), the model is asked to generate one or more explanations $X$ for the question $Q$ with choices $C$, emulating the relationship among the elements $(Q_e, C_e, X_e)$ of the examples. Finally, in the informed reasoning step, the same model is asked to answer the question $Q$ given the choices $C$ and the generated knowledge explanations $X$.
  • Figure 3: Comparison of the LLMs' performance on the CSQA development set using ZEBRA and direct knowledge retrieval (RACo-based Retrieval) as the number of retrieved examples/knowledge statements $k$ increases.
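
The three steps named in the Figure 2 caption (example retrieval, knowledge generation, informed reasoning) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the token-overlap retriever, the prompt wording, and every function name (`retrieve_examples`, `build_knowledge_prompt`, `build_answer_prompt`, `zebra_answer`) are hypothetical, and `llm` stands for any callable that maps a prompt string to generated text.

```python
from typing import Callable, List, Set, Tuple

# One stored example: (question Q_e, choices C_e, explanation X_e).
Example = Tuple[str, List[str], str]


def _tokens(text: str) -> Set[str]:
    """Lowercased bag of words; a toy stand-in for a learned retriever encoder."""
    return set(text.lower().replace("?", " ").split())


def retrieve_examples(question: str, choices: List[str],
                      collection: List[Example], k: int = 2) -> List[Example]:
    """Step 1 (example retrieval): rank stored examples by Jaccard overlap.

    Assumption: the real retriever is not specified here; Jaccard similarity
    is used only to keep the sketch self-contained.
    """
    query = _tokens(question + " " + " ".join(choices))

    def score(example: Example) -> float:
        q_e, c_e, _ = example
        candidate = _tokens(q_e + " " + " ".join(c_e))
        union = query | candidate
        return len(query & candidate) / len(union) if union else 0.0

    return sorted(collection, key=score, reverse=True)[:k]


def build_knowledge_prompt(question: str, choices: List[str],
                           examples: List[Example]) -> str:
    """Step 2 prompt: show (Q_e, C_e, X_e) triples, then ask for X for (Q, C)."""
    blocks = [
        f"Question: {q_e}\nChoices: {', '.join(c_e)}\nExplanation: {x_e}"
        for q_e, c_e, x_e in examples
    ]
    blocks.append(
        f"Question: {question}\nChoices: {', '.join(choices)}\nExplanation:"
    )
    return "\n\n".join(blocks)


def build_answer_prompt(question: str, choices: List[str],
                        explanations: List[str]) -> str:
    """Step 3 prompt: answer Q given C and the generated explanations X."""
    knowledge = "\n".join(f"- {x}" for x in explanations)
    return (
        f"Question: {question}\nChoices: {', '.join(choices)}\n"
        f"Knowledge:\n{knowledge}\nAnswer:"
    )


def zebra_answer(question: str, choices: List[str], collection: List[Example],
                 llm: Callable[[str], str], k: int = 2) -> Tuple[str, str]:
    """Run the three steps with the same model for generation and answering.

    Only one explanation is generated here for brevity; the caption allows
    one or more.
    """
    examples = retrieve_examples(question, choices, collection, k)
    explanation = llm(build_knowledge_prompt(question, choices, examples)).strip()
    answer = llm(build_answer_prompt(question, choices, [explanation])).strip()
    return answer, explanation
```

Because `llm` is passed in as a plain callable, the same sketch works with any model wrapper, and the generated explanation is returned alongside the answer so that it can be inspected.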