Sentiment Classification of Thai Central Bank Press Releases Using Supervised Learning

Stefano Grassi

TL;DR

This study applies supervised learning to sentiment classification of central bank communications, using the Bank of Thailand's English-language Monetary Policy Committee (MPC) press releases. It shows that Naïve Bayes, SVM, and Random Forest can classify sentiment on a modest labeled corpus using TF-IDF features over 1- to 3-grams. Naïve Bayes achieves the strongest macro-F1, around 0.75, suggesting that simple models can outperform more complex ones on small datasets, while larger datasets could unlock gains for SVM and Random Forest. The work highlights practical avenues for extending sentiment analysis to Thai-language texts and larger corpora, and discusses labeling bias, data accessibility, and the complementary role of supervised methods alongside dictionary-based approaches.
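For readers unfamiliar with the macro-F1 metric quoted above, the following is a minimal sketch of how it is computed: F1 is calculated separately for each class and then averaged without weighting, so rare classes count as much as common ones. The function name `macro_f1` and the three label names are illustrative assumptions and are not taken from the paper, whose labeling scheme is not described in this summary.

```python
def macro_f1(y_true, y_pred):
    """Return the unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels and predictions, for illustration only.
y_true = ["positive", "positive", "neutral", "neutral", "negative", "negative"]
y_pred = ["positive", "neutral", "neutral", "neutral", "negative", "positive"]

print(round(macro_f1(y_true, y_pred), 3))
```

Because every class contributes equally to the average, macro-F1 is a common choice when sentiment classes are imbalanced, which is often the case in small hand-labeled corpora.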

Abstract

Central bank communication plays a critical role in shaping economic expectations and monetary policy effectiveness. This study applies supervised machine learning techniques to classify the sentiment of press releases from the Bank of Thailand, addressing a gap in research that has focused primarily on lexicon-based approaches. My findings show that supervised learning can be an effective method, even with smaller datasets, and can serve as a starting point for further automation. However, achieving higher accuracy and better generalization requires a substantial amount of labeled data, and producing those labels is time-consuming and demands expertise. Using Naïve Bayes, Random Forest, and SVM models, this study demonstrates the applicability of machine learning to central bank sentiment analysis, with the Bank of Thailand's English-language communications as a case study.

Paper Structure

This paper contains 20 sections.