Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study
Zooey Nguyen, Anthony Annunziata, Vinh Luong, Sang Dinh, Quynh Le, Anh Hai Ha, Chanh Le, Hong An Phan, Shruti Raghavan, Christopher Nguyen
TL;DR
This study addresses domain-specific Q&A with large language models by examining two levers: domain-specific fine-tuning and iterative reasoning. Using the FinanceBench SEC financial filings dataset, the authors find that fine-tuning the embedding model used for indexing and retrieval yields meaningful accuracy gains, relatively greater than those from fine-tuning the generator. Adding the OODA (Observe, Orient, Decide, Act) reasoning loop on top of retrieval-augmented generation delivers the largest performance improvement, bringing Q&A outputs closer to human-expert quality. The work also proposes a structured technical design space for Q&A systems and offers recommendations for making high-impact technical choices when building high-precision, domain-aware Q&A systems in finance and beyond.
Abstract
This paper investigates the impact of domain-specific model fine-tuning and of reasoning mechanisms on the performance of question-answering (Q&A) systems powered by large language models (LLMs) and Retrieval-Augmented Generation (RAG). Using the FinanceBench SEC financial filings dataset, we observe that, for RAG, combining a fine-tuned embedding model with a fine-tuned LLM achieves better accuracy than generic models, with relatively greater gains attributable to the fine-tuned embedding model. Additionally, employing reasoning iterations on top of RAG delivers an even larger gain in performance, bringing the Q&A systems closer to human-expert quality. We discuss the implications of these findings, propose a structured technical design space capturing the major technical components of Q&A AI, and provide recommendations for making high-impact technical choices for those components. We plan to follow up on this work with actionable guides for AI teams and with further investigations into the impact of domain-specific augmentation in RAG and into agentic AI capabilities such as advanced planning and reasoning.
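To make the two levers concrete, the following is a minimal sketch, not the authors' implementation, of retrieval with a pluggable embedding function plus an iterative loop that gathers evidence one sub-question at a time. Everything in it is an assumption made for illustration: the function names (`embed`, `cosine`, `retrieve`, `answer_with_loop`), the bag-of-words stand-in for an embedding model, the hand-written sub-questions, and the toy corpus about a fictional "ExampleCo", whose figures are invented and do not come from FinanceBench.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; company and figures are invented for illustration.
DOCS = [
    "ExampleCo revenue in fiscal 2022 was 120 million dollars",
    "ExampleCo cost of goods sold in fiscal 2022 was 80 million dollars",
    "ExampleCo opened a new office in fiscal 2022",
]


def embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector.

    In a real system this is the component that could be swapped for a
    generic or a domain-fine-tuned embedding model.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def answer_with_loop(sub_questions, docs, max_iters=5):
    """Simplified observe/orient/decide/act loop over retrieval.

    Each iteration checks which sub-questions still lack evidence (orient),
    picks the next one (decide), retrieves for it (act), and records the
    result (observe). A real system would let an LLM drive these steps.
    """
    evidence = {}
    for _ in range(max_iters):
        pending = [q for q in sub_questions if q not in evidence]
        if not pending:
            break
        query = pending[0]
        evidence[query] = retrieve(query, docs, k=1)[0]
    return evidence


SUB_QUESTIONS = [
    "ExampleCo revenue fiscal 2022",
    "ExampleCo cost of goods sold fiscal 2022",
]
EVIDENCE = answer_with_loop(SUB_QUESTIONS, DOCS)
```

A question such as "What was ExampleCo's gross margin in fiscal 2022?" needs both the revenue line and the cost line, which a single retrieval pass may not surface together; the loop collects them separately before any answer is composed. Replacing `embed` with a stronger model changes retrieval quality without touching the loop, which is why the two levers can be evaluated independently.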
