Medfluencer: A Network Representation of Medical Influencers' Identities and Discourse on Social Media
Zhijin Guo, Edwin Simpson, Roberta Bernardi
TL;DR
The paper addresses how medical influencers shape public health discourse on social media by proposing a socio-semantic network framework that jointly models actor identities and discursive frames. It combines a few-shot, prompt-based, multi-label identity classifier with BERTopic and LDA topic modeling to map identities, network ties, and frames onto a unified network, using a dataset of the top-100 medical influencers on X during the COVID-19 era. The study finds that BERTopic yields more coherent frame representations than LDA and demonstrates meaningful cross-identity interactions and message propagation within the network, offering actionable insights for public health campaigns. Code is released to support reproducibility and further research in health communication analytics.
Abstract
We first constructed a dataset from the tweets of the 100 medical influencers with the highest Influencer Score during the COVID-19 pandemic. We then used this dataset to build a socio-semantic network that maps both the influencers' identities and the key topics they discuss, two elements that are crucial for understanding their impact on public health discourse. To do so, we developed a few-shot multi-label classifier to identify the identities of the influencers and of the actors in their networks, employed BERTopic to extract thematic content, and integrated these components into a single network model. To ensure the reproducibility of our results, we have made the code available at https://github.com/ZhijinGuo/Medinfluencer.
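
To make the idea of a socio-semantic network more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that identity labels (as a multi-label classifier might produce them) and topic assignments (as a topic model such as BERTopic might produce them) are already available. All actor names, labels, topics, and function names here are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical multi-label identities per actor (invented for illustration).
identities = {
    "dr_a": {"physician", "researcher"},
    "nurse_b": {"nurse"},
    "journalist_c": {"journalist"},
}

# Hypothetical social ties: (source, target), e.g. a mention or retweet.
social_edges = [
    ("dr_a", "nurse_b"),
    ("journalist_c", "dr_a"),
    ("nurse_b", "dr_a"),
]

# Hypothetical semantic ties: (actor, topic) taken from a topic model's output.
semantic_edges = [
    ("dr_a", "vaccines"),
    ("nurse_b", "vaccines"),
    ("journalist_c", "lockdown"),
    ("dr_a", "lockdown"),
]


def cross_identity_counts(identities, social_edges):
    """Count directed social ties between every pair of identity labels."""
    counts = Counter()
    for src, dst in social_edges:
        for src_label in identities[src]:
            for dst_label in identities[dst]:
                counts[(src_label, dst_label)] += 1
    return counts


def topics_by_identity(identities, semantic_edges):
    """Count how often actors holding each identity label use each topic."""
    usage = defaultdict(Counter)
    for actor, topic in semantic_edges:
        for label in identities[actor]:
            usage[label][topic] += 1
    return usage


cross = cross_identity_counts(identities, social_edges)
frames = topics_by_identity(identities, semantic_edges)

for (src_label, dst_label), n in sorted(cross.items()):
    print(f"{src_label} -> {dst_label}: {n}")
for label in sorted(frames):
    print(label, dict(frames[label]))
```

Because an actor may hold several identity labels, a single tie contributes to every pair of labels it connects. This is one possible counting convention among several; the paper may aggregate ties differently.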
