Redefining Information Retrieval of Structured Database via Large Language Models

Mingzhu Wang, Yuzhe Zhang, Qihang Zhao, Junyi Yang, Hong Zhang

TL;DR

An LLM-based search and question answering system for the financial domain is built by fine-tuning an LLM on two tasks, Text2API and API-ID recognition, so that the LLM's semantic understanding lets it serve as the retriever.

Abstract

Retrieval augmentation is critical when Language Models (LMs) exploit non-parametric knowledge related to the query through external knowledge bases before reasoning. The retrieved information is incorporated into LMs as context alongside the query, enhancing the reliability of responses to factual questions. Prior research in retrieval augmentation typically follows a retriever-generator paradigm. In this context, traditional retrievers encounter challenges in precisely and seamlessly extracting query-relevant information from knowledge bases. To address this issue, this paper introduces a novel retrieval augmentation framework called ChatLR that primarily employs the powerful semantic understanding ability of Large Language Models (LLMs) as retrievers to achieve precise and concise information retrieval. Additionally, we construct an LLM-based search and question answering system tailored for the financial domain by fine-tuning an LLM on two tasks, Text2API and API-ID recognition. Experimental results demonstrate the effectiveness of ChatLR in addressing user queries, achieving an overall information retrieval accuracy exceeding 98.8%.

Paper Structure (18 sections, 9 figures, 4 tables)

This paper contains 18 sections, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Performance of ChatLR (Ours).
  • Figure 2: An example of ChatLR. ChatLR facilitates the retrieval of relevant information within a database by transforming queries (e.g., "What’s Company XXX’s net profit for 2020?") into search command statements, such as API search commands or SQL queries.
  • Figure 3: An example of data generation process.
  • Figure 4: A Retrieval System for Structured Databases. ChatLR undertakes two distinct tasks. Initially, it engages in API-ID recognition, parsing the user's query to identify the specific API-ID. Subsequently, leveraging the API-Info list, it retrieves detailed information about the identified API, including input and output argument names. With the acquired API information, ChatLR proceeds to execute the Text2API task, generating the precise API search command statement. Finally, the search command is employed to retrieve accurate results from the database, which are then presented to the user.
  • Figure 5: Structured database API search framework. Users select the appropriate API-ID based on the functional descriptions of various APIs, inputting all required argument values for the chosen API. Subsequently, a database query is performed to obtain comprehensive tabular data.
  • ...and 4 more figures
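
The two-stage flow described in the Figure 4 caption (API-ID recognition, API-Info lookup, Text2API, then database search) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the API names, argument names, and database rows are all hypothetical, and the two LLM stages are replaced by simple keyword and regex stand-ins so the example runs without a model.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiInfo:
    api_id: str
    description: str
    input_args: tuple
    output_args: tuple


# Hypothetical API-Info list; the real system's APIs are not given in the source.
API_INFO = {
    "financial_indicator": ApiInfo(
        api_id="financial_indicator",
        description="Look up one financial indicator for a company and year",
        input_args=("company", "indicator", "year"),
        output_args=("value",),
    ),
    "stock_price": ApiInfo(
        api_id="stock_price",
        description="Look up a company's closing stock price on a date",
        input_args=("company", "date"),
        output_args=("close",),
    ),
}

# Toy structured database with made-up rows.
DATABASE = [
    {"company": "XXX", "indicator": "net profit", "year": 2020, "value": 1.5},
    {"company": "XXX", "indicator": "net profit", "year": 2019, "value": 1.2},
]


def recognize_api_id(query: str) -> str:
    """Stage 1 stand-in: a keyword rule where ChatLR would use the fine-tuned LLM."""
    return "stock_price" if "stock price" in query.lower() else "financial_indicator"


def text_to_api(query: str, info: ApiInfo) -> dict:
    """Stage 2 stand-in: regex slot filling where ChatLR would generate the command."""
    found = {}
    company = re.search(r"Company (\w+)", query)
    if company:
        found["company"] = company.group(1)
    year = re.search(r"\b(?:19|20)\d{2}\b", query)
    if year:
        found["year"] = int(year.group(0))
    if "net profit" in query.lower():
        found["indicator"] = "net profit"
    # Keep only the arguments that the identified API declares as inputs.
    args = {name: found[name] for name in info.input_args if name in found}
    return {"api_id": info.api_id, "args": args}


def execute(command: dict, database: list) -> list:
    """Run the search command: return the rows that match every argument."""
    args = command["args"]
    return [row for row in database if all(row.get(k) == v for k, v in args.items())]


def retrieve(query: str) -> list:
    api_id = recognize_api_id(query)      # 1. API-ID recognition
    info = API_INFO[api_id]               # 2. API-Info lookup
    command = text_to_api(query, info)    # 3. Text2API
    return execute(command, DATABASE)     # 4. database search


if __name__ == "__main__":
    print(retrieve("What's Company XXX's net profit for 2020?"))
```

Looking up the API-Info entry between the two stages means the command generator only has to fill the arguments that the chosen API actually declares, which is the role the caption assigns to the API-Info list.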