
Multi-Document Financial Question Answering using LLMs

Shalin Shah, Srikanth Ryali, Ramasubbu Venkatesh

TL;DR

Two new methods for multi-document financial question answering: RAG_SEM, which uses semantic tagging and then queries the index to get the context, and KG_RAG, which retrieves knowledge graph triples as context. Both outperform plain RAG significantly, and KG_RAG outperforms RAG_SEM in four out of nine metrics.

Abstract

We propose two new methods for multi-document financial question answering. The first, RAG_SEM, applies semantic tagging and then queries the index to retrieve the context. The second, KG_RAG, is a knowledge graph based method that applies semantic tagging and retrieves knowledge graph triples from a graph database as context. KG_RAG uses knowledge graphs constructed by a small model that is fine-tuned through knowledge distillation from a large teacher model. The data consists of 18 10-K reports from Apple, Microsoft, Alphabet, NVIDIA, Amazon and Tesla for the years 2021, 2022 and 2023. The question set contains 111 complex questions, including many esoteric ones that are difficult to answer and whose answers are not obvious. We report overall and segmented scores on the following metrics: faithfulness, relevance, correctness, similarity, an LLM-based overall score, ROUGE scores, and embedding similarity. We find that both methods outperform plain RAG significantly. KG_RAG outperforms RAG_SEM in four out of nine metrics.
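
The idea of tagging before retrieval can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword tagger, the bag-of-words cosine score (a stand-in for embedding similarity), the function names, and the placeholder chunk texts are all assumptions made for illustration. It only shows the general pattern of narrowing the candidate set by company and year tags before ranking.

```python
import math
import re
from collections import Counter

# Tag vocabulary taken from the companies and years named in the abstract.
COMPANIES = ["Apple", "Microsoft", "Alphabet", "NVIDIA", "Amazon", "Tesla"]
YEARS = ["2021", "2022", "2023"]


def tag(text):
    """Toy tagger (assumption): keyword matching stands in for a real semantic tagger."""
    lowered = text.lower()
    return {
        "companies": {c for c in COMPANIES if c.lower() in lowered},
        "years": {y for y in YEARS if y in text},
    }


def bow(text):
    """Bag-of-words vector; a crude substitute for an embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, chunks, top_k=2):
    """Keep chunks whose tags match the question's tags, then rank by similarity."""
    q_tags = tag(question)

    def matches(chunk_tags):
        # A tag field the question does not mention places no constraint.
        return all(not q_tags[k] or (q_tags[k] & chunk_tags[k]) for k in q_tags)

    candidates = [c for c in chunks if matches(c["tags"])]
    q_vec = bow(question)
    ranked = sorted(
        candidates, key=lambda c: cosine(q_vec, bow(c["text"])), reverse=True
    )
    return ranked[:top_k]


# Placeholder chunks (made up for illustration, not taken from any filing).
chunks = [
    {"text": "Risk factors related to supply chain concentration.",
     "tags": {"companies": {"Apple"}, "years": {"2022"}}},
    {"text": "Risk factors related to supply chain concentration.",
     "tags": {"companies": {"Tesla"}, "years": {"2022"}}},
    {"text": "Discussion of research and development spending.",
     "tags": {"companies": {"Apple"}, "years": {"2022"}}},
    {"text": "Risk factors related to supply chain concentration.",
     "tags": {"companies": {"Apple"}, "years": {"2021"}}},
]

results = retrieve("What supply chain risk factors did Apple describe in 2022?", chunks)
```

In this toy setup the Tesla and 2021 chunks are excluded before any similarity scoring happens, which is the point of tagging in a multi-document setting: near-identical passages from different companies or years cannot crowd out the relevant one.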

Paper Structure

This paper contains 11 sections, 6 figures, 1 table, and 3 algorithms.

Figures (6)

  • Figure 1: Segmented Comparison of the Three Methods of Financial Question Answering
  • Figure 2: Comparison of the Three Methods of Financial Question Answering
  • Figure 3: RAG: Plain Retrieval Augmented Generation
  • Figure 4: RAG_SEM: RAG with Semantic Tagging
  • Figure 5: KG_RAG: Knowledge Graph RAG with Semantic Tagging
  • ...and 1 more figure