Distilled ChatGPT Topic & Sentiment Modeling with Applications in Finance

Olivier Gandouet, Mouloud Belbahri, Armelle Jezequel, Yuriy Bodjov

TL;DR

This work addresses the challenge of extracting interpretable signals from large volumes of earnings call transcripts by combining large-language-model guidance with knowledge distillation to produce compact topic and sentiment classifiers. The authors build a lightweight topic model using MPNet with an MLP head, and a sentiment model via a two-stage distillation that uses a FinBERT teacher and ChatGPT-derived data. The resulting models achieve competitive performance (approximately $78\%$ F1 on expert data and up to $83\%$ on benchmarks). The approach is validated on S&P 1500 data (2010–2023), showing that topic propensity and net sentiment can correlate with sector-neutral returns, though the effects are topic-dependent and require careful topic differentiation. The framework enables efficient, deployable analysis suitable for quantitative investing, and the authors propose extensions to handle multi-topic sentences, sentence proximity, and interactive teacher–student refinement to adapt to changing market conditions.
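To make the two features named above more concrete, here is a minimal sketch of how sentence-level predictions might be aggregated into per-call features. The definitions are assumptions for illustration only: topic propensity is taken as the share of a call's sentences assigned to a topic, and net sentiment as the mean of sentence sentiments in {-1, 0, +1} within that topic. The summary does not give the authors' exact formulas, and the function name and topic labels are hypothetical.

```python
from collections import defaultdict


def call_features(sentences):
    """Aggregate (topic, sentiment) pairs for one earnings call.

    ASSUMPTION: propensity = share of sentences on a topic;
    net_sentiment = mean sentiment (-1, 0, +1) within that topic.
    These are illustrative definitions, not the paper's.
    """
    total = len(sentences)
    counts = defaultdict(int)
    sentiment_sum = defaultdict(int)
    for topic, sentiment in sentences:
        counts[topic] += 1
        sentiment_sum[topic] += sentiment

    # An empty call yields an empty feature dict (no division by zero).
    return {
        topic: {
            "propensity": n / total,
            "net_sentiment": sentiment_sum[topic] / n,
        }
        for topic, n in counts.items()
    }


# Hypothetical classifier output for one call.
example = [("guidance", 1), ("guidance", -1), ("guidance", 1), ("margins", -1)]
features = call_features(example)
```

Under these assumed definitions, features like these could then be compared across companies within a sector, which is the kind of sector-neutral analysis the summary describes.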

Abstract

In this study, ChatGPT is utilized to create streamlined models that generate easily interpretable features. These features are then used to evaluate financial outcomes from earnings calls. We detail a training approach that merges knowledge distillation and transfer learning, resulting in lightweight topic and sentiment classification models without significant loss in accuracy. These models are assessed through a dataset annotated by experts. The paper also delves into two practical case studies, highlighting how the generated features can be effectively utilized in quantitative investing scenarios.
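As an illustration of the knowledge distillation idea mentioned in the abstract, the sketch below computes a standard temperature-scaled distillation loss for a single example in plain Python. It is the generic formulation (soft teacher targets blended with a hard-label cross-entropy), not necessarily the loss the authors used. The function names and the `temperature` and `alpha` parameters are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def distillation_loss(student_logits, teacher_logits, hard_label,
                      temperature=2.0, alpha=0.5):
    """Generic distillation loss for one example (illustrative only).

    alpha weights the soft-target term (KL between softened teacher and
    student distributions); 1 - alpha weights cross-entropy on the label.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student_soft) if t > 0)

    # Hard-label cross-entropy uses the unsoftened student distribution.
    cross_entropy = -math.log(softmax(student_logits)[hard_label])

    # The T^2 factor keeps the soft-term gradient scale comparable across T.
    return alpha * temperature ** 2 * kl + (1 - alpha) * cross_entropy


teacher = [2.0, 0.0, -1.0]
matched = distillation_loss([2.0, 0.0, -1.0], teacher, hard_label=0)
mismatched = distillation_loss([-1.0, 0.0, 2.0], teacher, hard_label=0)
```

A student whose logits agree with the teacher incurs only the hard-label term, so `matched` is smaller than `mismatched`.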

Paper Structure

This paper contains 13 sections, 3 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Earnings Call Topic Classification Pipeline.
  • Figure 2: Examples of sentences labeled by ChatGPT.
  • Figure 3: Identified topics distribution and average sentiment per topic on the labeled sentences dataset.
  • Figure 4: Topic Classification Student Model Architecture.
  • Figure 5: Sentiment Classification Model Pipeline.
  • ...and 3 more figures