Disentangling the sources of cyber risk premia

Loïc Maréchal; Nathan Monnet

TL;DR

This paper develops a doc2vec-based framework to quantify firm-level cyber risk from 10-K disclosures using the MITRE ATT&CK knowledge base. It constructs four cyber-type scores, an aggregate cyber score, and a cyber sentiment score, applying them to a panel of about 7,000 US firms from 2007–2023. The results show positive risk premia for cyber-based portfolios and robust pricing performance across univariate, cross-sectional, and Bayesian asset-pricing tests, while suggesting the market views cyber risk as a single aggregated risk rather than distinguishing among cyber-attack types. The findings advance the integration of textual data and structured cyber knowledge into asset pricing, with potential implications for risk management, disclosure strategy, and investment allocation in cybersecurity-related exposures.

Abstract

We use a machine learning methodology to quantify firms' cyber risks from their disclosures and a dedicated cyber corpus. The model identifies paragraphs related to specific cyber-threat types and assigns the firm a corresponding set of cyber scores. The cyber scores are unrelated to other firm characteristics. Stocks with high cyber scores significantly outperform other stocks. The long-short cyber risk factors have positive risk premia, are robust to all factor benchmarks, and help price returns. Furthermore, our results suggest that the market does not distinguish between different types of cyber risks but instead views them as a single, aggregate cyber risk.
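The paragraph-scoring idea can be illustrated with a minimal sketch. It is not the authors' implementation: a bag-of-words count vector stands in for the doc2vec paragraph vector, the threat descriptions and filing paragraphs are made-up examples rather than MITRE ATT&CK text, and both aggregation rules (the maximum similarity over paragraphs per threat type, then the mean across types) are assumptions made only for illustration. The names `embed`, `cosine`, and `cyber_scores` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a doc2vec paragraph vector: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * v[word] for word, count in u.items() if word in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cyber_scores(paragraphs, threat_descriptions):
    """Score a filing against each threat type.

    Assumption: a firm's score for a threat type is the highest similarity
    between any of its paragraphs and that threat's description.
    """
    para_vecs = [embed(p) for p in paragraphs]
    scores = {}
    for name, description in threat_descriptions.items():
        threat_vec = embed(description)
        scores[name] = max((cosine(pv, threat_vec) for pv in para_vecs), default=0.0)
    return scores


# Hypothetical threat descriptions and filing paragraphs, for illustration only.
threats = {
    "credential_access": "adversaries steal account passwords and credentials",
    "exfiltration": "adversaries steal data from the network",
}
filing = [
    "Our revenue grew due to strong product demand.",
    "Attackers may steal passwords and credentials of our employees.",
]

scores = cyber_scores(filing, threats)
# Assumed aggregate: the simple mean of the per-type scores.
aggregate = sum(scores.values()) / len(scores)
```

In this toy filing, the second paragraph shares more vocabulary with the credential-access description than with the exfiltration one, so `scores["credential_access"]` comes out higher. A real pipeline would replace `embed` with trained paragraph vectors.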

Paper Structure (33 sections, 17 equations, 24 figures, 34 tables)

Introduction
Literature review
Sentiment analysis and text classification
Vector representation of paragraphs and topics clustering
Cyber risk and expected stock returns
Data and methodology
Market data
10-K statements
MITRE ATT&CK description
Cyber score
Preprocessing
Paragraph Vector algorithm (doc2vec)
Cosine similarity
Cyber tactics clustering
Setting the cyber score
...and 18 more sections

Figures (24)

Figure 1: Industry distribution
Figure 2: Number of 10-Ks per year
Figure 3: Structure of MITRE ATT&CK
Figure 4: Illustration of doc2vec training
Figure 5: Clustering results part.1
...and 19 more figures
