Knowledge Retrieval Based on Generative AI

Te-Lun Yang, Jyi-Shane Liu, Yuen-Hsien Tseng, Jyh-Shing Roger Jang

TL;DR

This work investigates a privacy-preserving knowledge retrieval system built on Retrieval-Augmented Generation (RAG) that leverages dense vector retrieval and a cross-encoder reranker. By pairing the BGE-M3 embedding model with a Lawbank-based, domain-specific data source and Chinese Wikipedia, the approach enhances LLM accuracy on knowledge-intensive tasks while enabling local deployment to protect privacy. Evaluations on TTQA and TMMLU+ show that RAG substantially improves many models, though gains vary by domain and model architecture; finance-domain results particularly benefit from Lawbank, and prompt refinements can yield further improvements. The study demonstrates the practical viability of domain-focused, privacy-conscious QA systems and outlines future directions for broader data sources, efficiency optimizations, and user-centered design.

Abstract

This study develops a question-answering system based on Retrieval-Augmented Generation (RAG), using Chinese Wikipedia and Lawbank as retrieval sources and TTQA and TMMLU+ as evaluation datasets. The system employs BGE-M3 for dense vector retrieval to obtain candidate passages and BGE-reranker to reorder those candidates by their relevance to the query. The top-ranked passages serve as reference knowledge for a Large Language Model (LLM), improving its ability to answer questions and establishing a knowledge retrieval system grounded in generative AI. The system's effectiveness is assessed in two stages: an automatic evaluation and an assisted performance evaluation. The automatic evaluation computes accuracy by comparing the model's generated answer labels with the ground-truth answers, measuring performance under standardized conditions without human intervention. The assisted performance evaluation uses 20 finance-related multiple-choice questions answered by 20 participants without financial backgrounds. Participants first answer independently and then receive system-generated reference information to assist in answering, which tests whether the system improves accuracy when assistance is provided. The main contributions of this research are: (1) Enhanced LLM Capability: By integrating BGE-M3 and BGE-reranker, the system retrieves and reorders highly relevant results, reduces hallucinations, and dynamically accesses authorized or public knowledge sources. (2) Improved Data Privacy: A customized RAG architecture enables the LLM to run locally, eliminating the need to send private data to external servers. This approach enhances data security, reduces reliance on commercial services, lowers operational costs, and mitigates privacy risks.
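
The two-stage flow described in the abstract (dense retrieval, then reranking, then passing the top passages to an LLM as reference knowledge) can be illustrated with a minimal sketch. This is not the authors' implementation: `embed` and `cross_score` are hypothetical toy stand-ins for BGE-M3 and BGE-reranker, the four-passage corpus is invented for illustration, and the prompt layout in `build_prompt` is an assumption. Only the control flow is meant to mirror the description.

```python
import math
from collections import Counter

# Toy passages standing in for the retrieval sources (invented for illustration).
CORPUS = [
    "Taipei is the capital of Taiwan.",
    "The banking act regulates deposit-taking institutions.",
    "Jade Mountain is the highest peak in Taiwan.",
    "Insurance contracts are governed by the insurance act.",
]


def embed(text):
    """Hypothetical stand-in for a dense encoder: a bag-of-words count vector."""
    cleaned = text.lower().replace(".", "").replace("?", "")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k):
    """Stage 1: encode query and passages independently, keep the k most similar."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def cross_score(query, passage):
    """Hypothetical stand-in for a cross-encoder: scores the pair jointly.

    Here the score is the fraction of query tokens found in the passage.
    """
    q_tokens = set(embed(query))
    p_tokens = set(embed(passage))
    return len(q_tokens & p_tokens) / len(q_tokens) if q_tokens else 0.0


def rerank(query, candidates, n):
    """Stage 2: rescore each (query, candidate) pair and keep the n best."""
    ranked = sorted(candidates, key=lambda p: cross_score(query, p), reverse=True)
    return ranked[:n]


def build_prompt(question, references):
    """Assemble the reference passages and the question into one LLM prompt."""
    refs = "\n".join(f"[{i + 1}] {r}" for i, r in enumerate(references))
    return f"Reference:\n{refs}\n\nQuestion: {question}\nAnswer:"


question = "What is the capital of Taiwan?"
candidates = retrieve(question, CORPUS, k=3)
references = rerank(question, candidates, n=1)
print(build_prompt(question, references))
```

In a real deployment the first stage would typically precompute passage embeddings and query a vector index, while the second stage would run the more expensive pairwise model only on the short candidate list. That cost asymmetry is the usual reason for splitting retrieval into two stages.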

Paper Structure

This paper contains 19 sections, 12 figures, 1 table.

Figures (12)

  • Figure 1: Basic Flow for this study
  • Figure 2: Bi-Encoder and Cross-Encoder
  • Figure 3: Performance (Benchmark: TTQA)
  • Figure 4: Performance for TAIDE-LX-7B-Chat
  • Figure 5: Performance for Llama-2-13b-chat-hf
  • ...and 7 more figures