WeaverBird: Empowering Financial Decision-Making with Large Language Model, Knowledge Base, and Search Engine

Siqiao Xue, Fan Zhou, Yi Xu, Ming Jin, Qingsong Wen, Hongyan Hao, Qingyang Dai, Caigao Jiang, Hongyu Zhao, Shuo Xie, Jianshan He, James Zhang, Hongyuan Mei

TL;DR

WeaverBird addresses the challenge of accessible, credible financial guidance by integrating a finance-tuned LLM with a local knowledge base and web search. It employs an efficiency-aware retrieval architecture, bilingual encoders trained via a contrastive objective, and a temporally grounded prompt formulation to deliver up-to-date, cited responses. Through extensive data collection, multi-source retrieval, and LoRA-based fine-tuning, WeaverBird achieves superior retrieval and response quality compared with several baselines, while offering scalable deployment and open-source resources. The work demonstrates the viability of retrieval-augmented, finance-domain AI assistants and outlines practical steps for extending to other domains and dialog capabilities.

Abstract

We present WeaverBird, an intelligent dialogue system designed specifically for the finance domain. Our system harnesses a GPT-architecture large language model that has been tuned on extensive corpora of finance-related text. As a result, it can understand complex financial queries, such as "How should I manage my investments during inflation?", and provide informed responses. Furthermore, our system incorporates a local knowledge base and a search engine to retrieve relevant information. The final responses are conditioned on the search results and include proper citations to the sources, which enhances their credibility. On a range of finance-related questions, our system outperforms other models. To experience our system firsthand, users can interact with our live demo at https://weaverbird.ttic.edu and watch our 2-minute video illustration at https://www.youtube.com/watch?v=yofgeqnlrMc.

Paper Structure (45 sections, 1 equation, 6 figures, 4 tables)

Figures (6)

  • Figure 1: An illustration of WeaverBird that answers a financial query by intelligent search and generation.
  • Figure 2: Retrieval performance of all combinations of encoders and similarity scores. From left to right, they are: pretrained ME5 with cosine similarity, trained ME5 with cosine similarity, trained M3E with dot product, trained ME5 with Euclidean distance, trained Contriever with Euclidean distance, trained Contriever with dot product, and trained Contriever with cosine similarity. Pretrained Contriever is not presented since its performance is very poor.
  • Figure 3: Response quality performance of all methods. From left to right, they are: WebGLM, FinGPT, FinChat, WeaverBird, WeaverBird without knowledge base, WeaverBird without search engine and WeaverBird with neither knowledge base nor search engine.
  • Figure 4: The effect of document retrieval accuracy on the response quality of the WeaverBird system.
  • Figure 5: The main interface of WeaverBird: the configuration and chatbox.
  • ...and 1 more figure
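
The three similarity scores compared in Figure 2 (cosine similarity, dot product, and Euclidean distance) can rank the same documents differently. The following is a minimal sketch of that effect, not the paper's implementation: the function names and the two-dimensional toy vectors are hypothetical stand-ins for real encoder outputs.

```python
import math

def dot(u, v):
    # Dot product: sensitive to both direction and vector length.
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    # Cosine similarity: dot product of the length-normalized vectors.
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def neg_euclidean(u, v):
    # Negated Euclidean distance, so that a higher score means "closer".
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank(query, docs, score):
    # Return document indices sorted from most to least similar.
    return sorted(range(len(docs)), key=lambda i: score(query, docs[i]), reverse=True)

# Hypothetical toy embeddings (assumed for illustration only).
query = [1.0, 0.0]
docs = [[0.9, 0.1], [3.0, 3.0], [0.0, 1.0]]

for name, score in [("dot", dot), ("cosine", cosine), ("euclidean", neg_euclidean)]:
    print(name, rank(query, docs, score))
```

In this toy setup the long vector at index 1 wins under the dot product but not under the other two scores, which is one reason an encoder trained with one score may behave differently when paired with another.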

Theorems & Definitions (2)

  • Example 1: Chinese
  • Example 2: English