Medfluencer: A Network Representation of Medical Influencers' Identities and Discourse on Social Media

Zhijin Guo, Edwin Simpson, Roberta Bernardi

TL;DR

The paper addresses how medical influencers shape public health discourse on social media by proposing a socio-semantic network framework that jointly models actor identities and discursive frames. It combines a few-shot, prompt-based, multi-label identity classifier with BERTopic and LDA topic modeling to map identities, network ties, and frames onto a unified network, using a dataset of the top-100 medical influencers on X during the COVID-19 era. The study finds that BERTopic yields more coherent frame representations than LDA and demonstrates meaningful cross-identity interactions and message propagation within the network, offering actionable insights for public health campaigns. Code is released to support reproducibility and further research in health communication analytics.

Abstract

We first constructed a dataset from the tweets of the 100 medical influencers with the highest Influencer Score during the COVID-19 pandemic. We then used this dataset to build a socio-semantic network that maps both the influencers' identities and their key topics, which are crucial for understanding their impact on public health discourse. To achieve this, we developed a few-shot multi-label classifier to identify the identities of the influencers and of the actors in their networks, employed BERTopic to extract thematic content, and integrated these components into a network model to analyze their impact on health discourse. To ensure the reproducibility of our results, we have made the code available at https://github.com/ZhijinGuo/Medinfluencer.
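
As a rough illustration of what a socio-semantic network of this kind might look like in code, the sketch below combines multi-label actor identities with topic-labelled interactions and counts which topics flow between identity groups. It is not the authors' implementation: the actor names, identity labels, topic labels, and the helper functions `build_network` and `cross_identity_topics` are all hypothetical, and the identity and topic labels are hard-coded here rather than produced by a classifier or by BERTopic.

```python
from collections import defaultdict

# Hypothetical toy data (not from the paper's dataset): each actor carries
# one or more identity labels, as a multi-label classifier might assign.
identities = {
    "alice": {"physician", "researcher"},
    "bob": {"journalist"},
    "carol": {"nurse"},
}

# Hypothetical interactions: (sender, receiver, topic label). In a real
# pipeline the topic label would come from a topic model.
messages = [
    ("alice", "bob", "vaccines"),
    ("alice", "carol", "vaccines"),
    ("carol", "bob", "masks"),
    ("bob", "alice", "vaccines"),
]


def build_network(messages):
    """Return directed edges keyed by (sender, receiver) with per-topic counts."""
    edges = defaultdict(lambda: defaultdict(int))
    for sender, receiver, topic in messages:
        edges[(sender, receiver)][topic] += 1
    return edges


def cross_identity_topics(edges, identities):
    """Count topic mentions for each (sender identity, receiver identity) pair."""
    counts = defaultdict(lambda: defaultdict(int))
    for (sender, receiver), topics in edges.items():
        for src in identities[sender]:
            for dst in identities[receiver]:
                for topic, n in topics.items():
                    counts[(src, dst)][topic] += n
    return counts


edges = build_network(messages)
summary = cross_identity_topics(edges, identities)
for pair in sorted(summary):
    print(pair, dict(summary[pair]))
```

Because an actor can hold several identities, a single message contributes to every sender-identity and receiver-identity pair it connects. That is one possible design choice for multi-label data; weighting each message fractionally across pairs would be an equally plausible alternative.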

Paper Structure (24 sections, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Comparisons of LDA topics and BERTopic topics. The topics from BERTopic are more semantically coherent.
  • Figure 2: Message Exchange Among Social Network Actors
  • Figure 3: An illustration of message passing, showing the connections between users of diverse identities through various topics.