Genetic Approach to Mitigate Hallucination in Generative IR

Hrishikesh Kulkarni, Nazli Goharian, Ophir Frieder, Sean MacAvaney

TL;DR

This work adapts an existing genetic generation approach with a new 'balanced fitness function' that combines a cross-encoder model for relevance with an n-gram overlap metric to promote grounding. The approach quadruples grounded answer generation accuracy while maintaining high relevance.

Abstract

Generative language models hallucinate. That is, at times, they generate factually flawed responses. These inaccuracies are particularly insidious because the responses are fluent and well-articulated. We focus on the task of Grounded Answer Generation (part of Generative IR), which aims to produce direct answers to a user's question based on results retrieved from a search engine. We address hallucination by adapting an existing genetic generation approach with a new 'balanced fitness function' consisting of a cross-encoder model for relevance and an n-gram overlap metric to promote grounding. Our balanced fitness function approach quadruples the grounded answer generation accuracy while maintaining high relevance.
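
As a rough illustration of how a fitness function might balance relevance against grounding, the following Python sketch scores a candidate answer with a weighted sum of a relevance score and an n-gram overlap score. Everything here is an assumption made for illustration: the names `ngram_precision`, `toy_relevance`, and `balanced_fitness`, the weight `alpha`, the weighted-sum combination, and the use of bigram precision are not taken from the paper. In particular, `toy_relevance` is a simple token-overlap stand-in, not a cross-encoder; a real relevance model would be passed in through the `relevance` parameter.

```python
from collections import Counter
from typing import Callable, List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(answer: str, sources: List[str], n: int = 2) -> float:
    """Fraction of the answer's n-grams that also occur in a source document."""
    answer_grams = ngrams(answer.lower().split(), n)
    if not answer_grams:
        return 0.0
    source_grams: Counter = Counter()
    for doc in sources:
        # Counter union keeps the maximum count seen in any single document.
        source_grams |= ngrams(doc.lower().split(), n)
    overlap = sum((answer_grams & source_grams).values())
    return overlap / sum(answer_grams.values())


def toy_relevance(query: str, answer: str) -> float:
    """Hypothetical stand-in for a cross-encoder: share of query tokens in the answer."""
    query_tokens = set(query.lower().split())
    answer_tokens = set(answer.lower().split())
    return len(query_tokens & answer_tokens) / len(query_tokens) if query_tokens else 0.0


def balanced_fitness(
    query: str,
    answer: str,
    sources: List[str],
    relevance: Callable[[str, str], float] = toy_relevance,
    alpha: float = 0.5,  # assumed weight; not a value from the paper
    n: int = 2,
) -> float:
    """Assumed combination: weighted sum of relevance and grounding scores."""
    grounding = ngram_precision(answer, sources, n)
    return alpha * relevance(query, answer) + (1 - alpha) * grounding


if __name__ == "__main__":
    docs = ["the eiffel tower is in paris france"]
    question = "where is the eiffel tower"
    for candidate in ("the eiffel tower is in paris", "the eiffel tower is in rome italy"):
        print(candidate, "->", round(balanced_fitness(question, candidate, docs), 3))
```

In this toy setup both candidates receive the same relevance score, so the one whose n-grams are all supported by the retrieved document ranks higher. That is the behavior a grounding term is meant to encourage when a genetic search selects among candidate answers.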

Paper Structure (15 sections, 1 equation, 3 figures, 3 tables, 1 algorithm)

Figures (3)

  • Figure 1: System Architecture. Answer: a, Query: q, Seed documents: seed.
  • Figure 2: Comparison between GPT-3 and GAuGE in mitigating hallucinations across datasets.
  • Figure 3: Hallucination mitigation and relevance with GPT-3-based GAuGE using three ROUGE metrics. Relevance comparisons are with respect to the top retrieved results.