RAG-IT: Retrieval-Augmented Instruction Tuning for Automated Financial Analysis -- A Case Study for the Semiconductor Sector
Hai-Thien To, Tien-Cuong Bui, Van-Duc Le
TL;DR
RAG-IT tackles automated earnings analysis for the semiconductor sector by combining retrieval-grounded data generation with instruction-tuned language models. The approach builds a two-tier financial instruction dataset and uses retrieval-augmented fine-tuning (via LoRA and 4-bit QLoRA on Llama-2-7B) to produce contextually grounded, analyst-style outputs; a vector-based retriever anchors responses to authentic documents. Experimental results show that the financially augmented model outperforms a general open-source baseline and approaches the performance of GPT-3.5 on earnings-report tasks, with qualitative analyses highlighting improved grounding and numeric reliability. This work demonstrates a scalable path to domain-adapted, explainable financial analysis that can reduce reliance on expensive proprietary models and speed up automated reporting in finance.
Abstract
Financial analysis relies heavily on the interpretation of earnings reports to assess company performance and guide decision-making. Traditional methods for generating such analyses require significant financial expertise and are often time-consuming. With the rapid advancement of Large Language Models (LLMs), domain-specific adaptations have emerged for financial tasks such as sentiment analysis and entity recognition. This paper introduces RAG-IT (Retrieval-Augmented Instruction Tuning), a novel framework designed to automate the generation of earnings report analysis through an LLM fine-tuned specifically for the financial domain. Our approach integrates retrieval augmentation with instruction-based fine-tuning to enhance factual accuracy, contextual relevance, and domain adaptability. We construct a sector-specific financial instruction dataset derived from semiconductor industry documents to guide the LLM adaptation to specialized financial reasoning. Using NVIDIA, AMD, and Broadcom as representative companies, our case study demonstrates that RAG-IT substantially improves a general-purpose open-source LLM and achieves performance comparable to commercial systems like GPT-3.5 on financial report generation tasks. This research highlights the potential of retrieval-augmented instruction tuning to streamline and elevate financial analysis automation, advancing the broader field of intelligent financial reporting.
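
As a rough illustration of the idea of pairing retrieval with instruction data, the sketch below builds one retrieval-grounded instruction record. It is not the paper's pipeline: the bag-of-words "embedding" stands in for a learned embedding model and vector store, the function names (embed, cosine, retrieve, build_record) and the record fields are hypothetical, and the passages are invented placeholders rather than text from real filings.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: term counts. A real system would use a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_record(instruction, passages, response, k=2):
    """Attach retrieved context to an instruction/response pair (hypothetical schema)."""
    context = retrieve(instruction, passages, k)
    return {
        "instruction": instruction,
        "context": "\n".join(context),
        "response": response,
    }


# Invented placeholder passages, not excerpts from any real earnings report.
PASSAGES = [
    "Company A data center revenue grew on demand for accelerators.",
    "Company B announced a new office lease in another city.",
    "Company A gaming revenue declined as channel inventory normalized.",
]

record = build_record(
    "Summarize Company A data center revenue trends.",
    PASSAGES,
    "PLACEHOLDER analyst-style answer written against the retrieved context.",
    k=1,
)
print(record["context"])
```

Records of this shape could then be passed to a parameter-efficient fine-tuning setup such as LoRA or QLoRA; that step is omitted here because its configuration is not specified in this section.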
