Adversarial Databases Improve Success in Retrieval-based Large Language Models

Sean Wu; Michael Koo; Li Yo Kao; Andy Black; Lesley Blum; Fabien Scalzo; Ira Kurtz

TL;DR

This study probes whether adversarial background data can improve retrieval-augmented generation (RAG) for open-source LLMs, specifically on nephrology MCQs. By comparing relevant corpora (nephSAP, UpToDate) with adversarial texts (Bible, Random Words) across multiple models, it shows that adversarial data can significantly boost performance for several LLMs, in some cases approaching or matching gains from relevant sources. The improvements are model-dependent, and in some cases adversarial inputs degrade performance, underscoring the role of the model's attention dynamics in RAG. The authors propose an attention-based mechanism as a potential explanation and advocate for broader exploration of retrieval data design, with open-source code and data provided to enable replication and further study.

Abstract

Open-source LLMs have shown great potential as fine-tuned chatbots, demonstrating robust reasoning abilities and surpassing many existing benchmarks. Retrieval-Augmented Generation (RAG) is a technique for improving the performance of LLMs on tasks that the models were not explicitly trained on, by leveraging external knowledge databases. Numerous studies have demonstrated that RAG improves success on downstream tasks when the vector databases consist of relevant background information. It has been implicitly assumed in the field that if adversarial background information were used instead, a RAG-based approach would provide no benefit or would even harm the results. To test this assumption, we evaluated the ability of RAG to improve the success of several open-source LLMs in answering multiple-choice questions (MCQ) in the medical subspecialty of Nephrology. Unlike previous studies, we examined the effect of RAG with both relevant and adversarial background databases. We set up several open-source LLMs, including Llama 3, Phi-3, Mixtral 8x7b, Zephyr$\beta$, and Gemma 7B Instruct, in a zero-shot RAG pipeline. As adversarial sources of information, text from the Bible and a database of randomly generated words were used for comparison. Our data show that most of the open-source LLMs improve their multiple-choice test-taking success, as expected, when incorporating vector databases of relevant information. Surprisingly, however, adversarial Bible text significantly improved the success of many LLMs, and even random word text improved the test-taking ability of some of the models. In summary, our results demonstrate for the first time the counterintuitive ability of adversarial information datasets to improve RAG-based LLM success.
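
The zero-shot RAG setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the bag-of-words `embed` function stands in for a real embedding model, and the names `retrieve`, `build_prompt`, `relevant_db`, and `adversarial_db`, along with the sample texts and prompt wording, are hypothetical. The point it shows is that top-k retrieval always returns some context, so an adversarial database still changes the prompt the model sees.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question, relevant or not."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Prepend retrieved context to the question (zero-shot, no examples)."""
    context = "\n".join(retrieve(question, chunks, k))
    return (
        f"Context:\n{context}\n\n"
        f"Question:\n{question}\n"
        "Answer with a single letter (A-E)."
    )


# Hypothetical miniature databases, for illustration only.
relevant_db = [
    "Hyperkalemia in chronic kidney disease results from reduced renal potassium excretion.",
    "Metabolic acidosis is common when the glomerular filtration rate falls.",
]
adversarial_db = [
    "In the beginning God created the heaven and the earth.",
    "lantern orbit velvet cactus marble",
]

question = "What causes hyperkalemia in chronic kidney disease?"
relevant_prompt = build_prompt(question, relevant_db, k=1)
adversarial_prompt = build_prompt(question, adversarial_db, k=1)
```

Both prompts have the same shape and differ only in the retrieved context, which is the variable the study manipulates.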

Paper Structure (25 sections, 1 equation, 3 figures, 6 tables)

Introduction
Zero-Shot Querying
Problem Definition
Methods
Databases and Definition of Irrelevance
Open-Source Large Language Models
Llama 3
Phi-3
Mixtral 8x7b
Gemma
Zephyr$\beta$
Retrieval Augmented Generation
Detailed RAG Workflow
Quantifying LLM Outputs
Regex Pattern Matching
...and 10 more sections

Figures (3)

Figure 1: Overall methodology used to demonstrate that, in RAG-based settings, adversarial databases can counterintuitively improve the success of specific LLMs in correctly answering domain-specific MCQs.
Figure 2: Example of a possible parse tree for automatically extracting the answer choice from the LLM output. After pattern matching of the introductory phrase and the explanatory phrase, the automated script can output which answer choice (A-E) was chosen.
Figure 3: Visualization of DistilBERT attention outputs given an MCQ prompt alone and given a Bible or Random Words + MCQ prompt. The weighting matrices differ visibly between the two cases.
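
The answer-extraction step summarized in the Figure 2 caption could be sketched with regular expressions along the following lines. The patterns and the function name `extract_choice` are assumptions made for illustration; the authors' actual patterns are not given in this summary and may differ.

```python
import re

# Introductory phrase such as "The correct answer is C" or "Answer: (B)".
# The phrase is matched case-insensitively; the letter itself must be uppercase
# so that ordinary words like "a" are not mistaken for choice A.
ANSWER_PHRASE = re.compile(
    r"(?i:answer(?:\s+choice)?\s*(?:is|:))\s*\(?([A-E])\b"
)

# Fallback: a line that starts directly with a labeled choice, e.g. "D) ...".
LEADING_CHOICE = re.compile(r"^\s*\(?([A-E])[).:]", re.MULTILINE)


def extract_choice(output):
    """Return the chosen letter A-E from free-text LLM output, or None."""
    match = ANSWER_PHRASE.search(output) or LEADING_CHOICE.search(output)
    return match.group(1) if match else None


samples = [
    "The correct answer is C. Reduced potassium excretion explains the finding.",
    "Answer: (B) because the anion gap is normal.",
    "D) Renal tubular acidosis",
    "I am not sure which option applies.",
]
choices = [extract_choice(s) for s in samples]
```

Returning `None` for unparseable outputs keeps them distinguishable from wrong answers, so they can be counted or reviewed separately.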
