Ingest-And-Ground: Dispelling Hallucinations from Continually-Pretrained LLMs with RAG

Chenhao Fang, Derek Larson, Shitong Zhu, Sophie Zeng, Wendy Summer, Yanqing Peng, Yuriy Hulovatyy, Rajeev Rao, Gabriel Forgues, Arya Pudota, Alex Goncalves, Hervé Robert

TL;DR

This paper continually pre-trains a base LLM on a privacy-specific knowledge base and then augments it with a semantic RAG layer, aiming to reduce hallucination and improve the efficiency of privacy processes.

Abstract

This paper presents new methods that have the potential to improve the efficiency of privacy processes using LLMs and RAG. To reduce hallucination, we continually pre-train a base LLM on a privacy-specific knowledge base and then augment it with a semantic RAG layer. Our evaluations show that this approach improves model performance on privacy-related queries (with metrics as much as doubled compared to the out-of-the-box LLM) by grounding responses in factual information, which reduces inaccuracies.
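
The grounding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the knowledge-base entries, the function names (`embed`, `retrieve`, `build_grounded_prompt`), and the bag-of-words similarity are all assumptions made for illustration. A real semantic RAG layer would use a learned embedding model, and the resulting prompt would be passed to the continually pre-trained LLM.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a privacy knowledge base; entries are illustrative only.
KNOWLEDGE_BASE = [
    "Access requests must be answered within the documented response window.",
    "Retention schedules define how long each data category may be stored.",
    "Data minimization means collecting only what a stated purpose requires.",
]

def embed(text):
    """Toy bag-of-words vector; an assumption standing in for a learned encoder."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=1):
    """Return the k passages most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_grounded_prompt(query, docs, k=1):
    """Prepend retrieved passages so the model answers from stated facts."""
    context = "\n".join(f"- {p}" for p in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )
```

Calling `build_grounded_prompt("How long may data be stored?", KNOWLEDGE_BASE)` yields a prompt whose context is the retention passage; constraining the answer to retrieved text is what is meant by grounding here.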

Paper Structure

This paper contains 4 sections, 2 figures.

Figures (2)

  • Figure 1: System overview of PrivacyBrain: how we factually ground LLM hallucinations with RAG
  • Figure 2: Performance comparison with baselines