Utilizing GPT to Enhance Text Summarization: A Strategy to Minimize Hallucinations
Hassan Shakil, Zeydy Ortiz, Grant C. Forbes
TL;DR
This work tackles hallucinations in AI-generated text summaries by combining extractive, abstractive, and hybrid approaches (DistilBERT and T5) with a GPT-based refinement stage. The pipeline generates unrefined summaries, refines them via prompted GPT evaluations, and assesses the improvements using a diverse metric set that includes FactSumm, QAGS, SummaC, ROUGE, and GPT-3.5 Turbo-derived analyses. Results show significant gains in factual consistency and hallucination reduction, particularly for abstractive and hybrid summaries, though some metrics show mixed results. The study highlights the need for evaluation frameworks that better capture semantic and factual fidelity in large language model-assisted summarization and proposes a practical path toward more reliable automatic summarization systems.
Abstract
In this research, we use the DistilBERT model to generate extractive summaries and the T5 model to generate abstractive summaries. We also generate hybrid summaries by combining the DistilBERT and T5 models. Central to our research is a GPT-based refinement process designed to minimize hallucinations, a common problem in AI-generated summaries. We evaluate the unrefined summaries and, after refinement, assess the refined summaries using a range of traditional and novel metrics, demonstrating marked improvements in the accuracy and reliability of the summaries. The results highlight significant reductions in hallucinatory content, thereby increasing the factual integrity of the summaries.
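As a rough illustration of the generate-then-refine flow described in the abstract, the following minimal sketch wires three draft summaries through a single refinement step. It is not the authors' implementation. The summarizers and the language model are passed in as plain callables, so no specific library API is assumed; the prompt wording, the function names, and the choice to build the hybrid summary by abstracting over the extractive output are all assumptions made for illustration. The toy stand-ins at the bottom exist only so the sketch runs without model downloads or API keys.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Summarizer = Callable[[str], str]      # source text -> summary
LanguageModel = Callable[[str], str]   # prompt -> completion


@dataclass
class SummaryPair:
    unrefined: str
    refined: str


def build_refinement_prompt(source: str, draft: str) -> str:
    # Hypothetical prompt wording; the prompts used in the paper are not shown here.
    return (
        "Check the summary against the source. Remove or correct any statement "
        "the source does not support, then return the revised summary.\n\n"
        f"Source:\n{source}\n\nSummary:\n{draft}"
    )


def run_pipeline(
    source: str,
    extractive: Summarizer,
    abstractive: Summarizer,
    llm: LanguageModel,
) -> Dict[str, SummaryPair]:
    drafts = {
        "extractive": extractive(source),
        "abstractive": abstractive(source),
        # Assumption: the hybrid summary abstracts over the extractive output.
        "hybrid": abstractive(extractive(source)),
    }
    # Each draft is refined independently and kept next to its unrefined
    # version so both can be scored with the same metrics afterwards.
    return {
        name: SummaryPair(draft, llm(build_refinement_prompt(source, draft)))
        for name, draft in drafts.items()
    }


# Toy stand-ins (not DistilBERT, T5, or GPT) so the sketch is runnable as-is.
def toy_extractive(text: str) -> str:
    # Keep only the first sentence.
    return text.split(". ")[0].rstrip(".") + "."


def toy_abstractive(text: str) -> str:
    # Keep only the first four whitespace-separated tokens.
    return " ".join(text.split()[:4])


def echo_llm(prompt: str) -> str:
    # Returns the draft unchanged, i.e. a refiner that finds nothing to fix.
    return prompt.rsplit("Summary:\n", 1)[1]


source = "Cats sleep a lot. They also hunt mice."
result = run_pipeline(source, toy_extractive, toy_abstractive, echo_llm)
```

In practice, the three callables would wrap the actual DistilBERT, T5, and GPT calls. Keeping the unrefined and refined versions side by side mirrors the before-and-after evaluation the abstract describes.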
