Understanding how social discussion platforms like Reddit are influencing financial behavior

Sachin Thukral, Suyash Sangwan, Arnab Chatterjee, Lipika Dey, Aaditya Agrawal, Pramit Kumar Chandra, Animesh Mukherjee

TL;DR

This paper addresses how social discussion platforms influence financial behavior by applying NLP and social network analysis to Reddit finance discussions. It advances topic modeling with a skewness-based criterion to select the optimal topic count and maps content to behavioral finance components, enabling interpretation of user concerns, risk attitudes, and knowledge needs. It introduces metrics for participation, reach, and influence, and uses a directed interaction graph to reveal influencer communities and trust dynamics, highlighting the role of comments in information diffusion. The approach demonstrates generalizability to other platforms and domains and sets the stage for deeper analyses of financial behaviors and knowledge dissemination in online communities.

Abstract

This study proposes content and interaction analysis techniques for a large repository created from social media content. Although we present our study for a large platform dedicated to discussions of financial topics, the proposed methods are generic and applicable to other platforms. Along with an extension of the topic extraction method based on Latent Dirichlet Allocation, we propose measures to assess user participation, influence, and topic affinities. Our study also maps user-generated content to components of behavioral finance. While this kind of information is usually gathered through surveys, large-scale analysis of social media data can reveal insights that are rare or previously unknown. We also characterise users by their platform behavior and use graph analysis to provide insights into how communities form and how trust is established on these platforms.
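
As a rough illustration of the interaction analysis mentioned above, the sketch below builds a directed reply graph and counts how many distinct users respond to each author. It is not the paper's method: the `(commenter, replied_to)` input format, the function names, and the use of in-degree as a stand-in for reach are all assumptions made for this example.

```python
from collections import defaultdict


def build_interaction_graph(replies):
    """Build a weighted directed graph from (commenter, replied_to) pairs.

    graph[src][dst] is the number of times src replied to dst.
    The pair format is a hypothetical input, not the paper's schema.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for src, dst in replies:
        if src != dst:  # ignore users replying to themselves
            graph[src][dst] += 1
    return graph


def in_degree(graph):
    """Count the distinct users who replied to each user (a simple reach proxy)."""
    counts = defaultdict(int)
    for src, targets in graph.items():
        for dst in targets:
            counts[dst] += 1
    return dict(counts)


# Toy data: usernames are invented for illustration only.
replies = [
    ("bob", "alice"),
    ("carol", "alice"),
    ("bob", "alice"),
    ("alice", "carol"),
    ("dave", "dave"),
]
graph = build_interaction_graph(replies)
reach = in_degree(graph)
```

Edge weights keep the reply counts, so the same structure could feed a weighted centrality or community detection step if a richer notion of influence is needed.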

Paper Structure

This paper contains 10 sections, 5 equations, 8 figures, 3 tables, and 1 algorithm.

Figures (8)

  • Figure 1: Post score vs. maximum score received by a comment on the post (both axes log10-scaled)
  • Figure 2: Comment score vs. response time (s) (both axes log10-scaled)
  • Figure 3: Topic co-occurrences
  • Figure 4: Month-wise topic distribution; for each topic, the bars run from July 2020 to June 2021
  • Figure 5: Topic-wise presence of PCA components
  • ...and 3 more figures