Identifying Financial Risk Information Using RAG with a Contrastive Insight
Ali Elahi
TL;DR
The paper tackles the problem that retrieval-augmented generation in finance often yields generic risk signals. It introduces a peer-aware contrastive inference layer on top of RAG that retrieves broad risk information for a target firm and its peers and then contrasts them to surface distinctive risks. Experimental results show the contrastive approach improves ROUGE and BERTScore compared with baselines, with an O3 model achieving the strongest performance. This work enhances practical equity research by delivering context-sensitive, comparative risk insights that better align with expert investment theses.
Abstract
In specialized domains, humans rarely analyze information in isolation: they compare a new problem against similar examples, highlight the nuances, and then draw conclusions. When LLMs reason in specialized contexts on top of RAG, the pipeline can capture contextually relevant information, but it is not designed to retrieve comparable cases or related problems. As a result, while RAG is effective at extracting factual information, its outputs in specialized reasoning tasks often remain generic, reflecting broad facts rather than context-specific insights. In finance, this yields generic risks that hold for the majority of companies. To address this limitation, we propose a peer-aware comparative inference layer on top of RAG. Our contrastive approach outperforms baseline RAG on text generation metrics such as ROUGE and BERTScore when its outputs are compared with human-generated equity research and risk.
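To make the contrastive idea concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes the retrieval step has already produced a list of short risk phrases for the target firm and for each peer, and it replaces the LLM-based contrast with a simple frequency filter. The function name `distinctive_risks`, the `max_peer_share` threshold, the peer labels, and the example risk phrases are all hypothetical.

```python
from collections import Counter


def distinctive_risks(target_risks, peer_risks, max_peer_share=0.5):
    """Keep target risks that at most `max_peer_share` of peers also mention.

    Toy stand-in for a contrastive step: the threshold rule is an
    assumption for illustration, not the method described in the paper.
    """
    if not peer_risks:
        # With no peers there is nothing to contrast against.
        return sorted(set(target_risks))

    # Count how many peers mention each risk (once per peer).
    counts = Counter()
    for risks in peer_risks.values():
        counts.update(set(risks))

    n_peers = len(peer_risks)
    return sorted(
        risk
        for risk in set(target_risks)
        if counts[risk] / n_peers <= max_peer_share
    )


# Hypothetical retrieved risk phrases for a target firm and three peers.
target = [
    "interest rate exposure",
    "supply chain disruption",
    "single-customer concentration",
]
peers = {
    "PEER_A": ["interest rate exposure", "supply chain disruption"],
    "PEER_B": ["interest rate exposure", "regulatory change"],
    "PEER_C": ["interest rate exposure", "supply chain disruption"],
}

result = distinctive_risks(target, peers)
print(result)
```

In this toy data, risks shared by most peers are filtered out as generic, and only the risk that no peer mentions is kept as distinctive. A real pipeline would contrast retrieved passages with an LLM rather than match exact phrases.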
