AMA-LSTM: Pioneering Robust and Fair Financial Audio Analysis for Stock Volatility Prediction

Shengkun Wang, Taoran Ji, Jianfeng He, Mariam Almutairi, Dan Wang, Linhan Wang, Min Zhang, Chang-Tien Lu

TL;DR

This work tackles robustness and fairness in stock volatility prediction from earnings-call audio and transcripts by introducing AMA-LSTM, a multimodal attentive LSTM trained with input-space adversarial perturbations. The method jointly processes audio and text through unimodal BiLSTMs, fuses them via attention, and predicts horizon-specific volatility while optimizing a robust min-max objective to resist stochastic market noise and gender-based biases. Empirical results on two real-world datasets show AMA-LSTM achieves state-of-the-art MSE and reduced gender bias (lower $\Delta MSE$) compared with strong baselines, outperforming random perturbation and non-adversarial variants. The findings highlight the value of adversarial training for improving robustness and fairness in financial deep learning, with practical impact for more reliable volatility forecasting using public earnings-call data.

Abstract

Stock volatility prediction is an important task in the financial industry. Recent multimodal methods that integrate textual and audio data, such as earnings calls, have demonstrated significant improvements in this domain. (Earnings calls are publicly available and typically involve the management team of a public company and interested parties discussing the company's earnings.) However, these multimodal methods have two drawbacks. First, they often fail to yield reliable models and overfit the data because they absorb stochastic information from the stock market. Second, multimodal models for stock volatility prediction suffer from gender bias, and there is no efficient way to eliminate it. To address these problems, we use adversarial training to generate perturbations that simulate the inherent stochasticity and bias, creating regions around the input space that are resistant to random information and thereby improving model robustness and fairness. Our comprehensive experiments on two real-world financial audio datasets show that this method outperforms current state-of-the-art solutions. This confirms the value of adversarial training in reducing stochasticity and bias for stock volatility prediction tasks.
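
The robust min-max objective mentioned in the TL;DR can be written in the generic form commonly used for adversarial training. This is a sketch under assumptions rather than the paper's own equation: the symbols $f_\theta$ (the model), $x_i$ (the audio and text inputs of call $i$), $y_i$ (the target volatility), $\mathcal{L}$ (the prediction loss, taken here to be squared error), and $\epsilon$ (the perturbation budget) are illustrative and do not appear in the text above.

$$
\min_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} \; \max_{\lVert \delta_i \rVert_2 \le \epsilon} \; \mathcal{L}\big(f_\theta(x_i + \delta_i),\, y_i\big)
$$

The inner maximization searches for the worst-case perturbation $\delta_i$ within a small ball around each input, and the outer minimization fits the parameters $\theta$ against those perturbed inputs. This is what makes the neighborhood around each training point resistant to random information.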

Paper Structure

This paper contains 12 sections, 6 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: (a) displays the percentage of women CEOs in recent years, and (b) compares the proportion of female to male CEOs within the two datasets used.
  • Figure 2: Illustration of the adversarial multimodal attentive LSTM architecture and an attentive BiLSTM block.
  • Figure 3: Illustration of the AMA-LSTM adversarial training process. Perturbations ($\delta$) are derived by computing the gradients of the loss function with respect to the token embeddings.
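
The perturbation step described in the Figure 3 caption can be illustrated with a small, self-contained sketch. It is not the AMA-LSTM implementation: a linear predictor with squared loss stands in for the network so that the gradient with respect to the input can be written analytically, and the L2-normalised gradient step (a fast-gradient-style rule) is an assumed choice. The names `squared_loss`, `loss_grad_wrt_input`, `adversarial_perturbation`, and `epsilon` are all hypothetical.

```python
import math


def squared_loss(w, x, y):
    """Squared error of a linear predictor w.x against target y."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    return (pred - y) ** 2


def loss_grad_wrt_input(w, x, y):
    """Analytic gradient of the squared loss with respect to the input x."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    return [2.0 * (pred - y) * wi for wi in w]


def adversarial_perturbation(w, x, y, epsilon):
    """Assumed rule: step of L2 length epsilon along the loss gradient."""
    g = loss_grad_wrt_input(w, x, y)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # Zero gradient: no direction increases the loss to first order.
        return [0.0] * len(x)
    return [epsilon * gi / norm for gi in g]


# Toy stand-ins for an embedding vector, model weights, and a target.
w = [0.5, -1.0, 2.0]
x = [1.0, 2.0, 0.5]
y = 0.0

delta = adversarial_perturbation(w, x, y, epsilon=0.1)
x_adv = [xi + di for xi, di in zip(x, delta)]

clean_loss = squared_loss(w, x, y)
adv_loss = squared_loss(w, x_adv, y)
print(f"clean loss: {clean_loss:.4f}, adversarial loss: {adv_loss:.4f}")
```

In adversarial training, the loss on `x_adv` would be added to (or replace) the loss on `x` when updating the model parameters, so the model learns to predict consistently inside the `epsilon`-ball around each input.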