Real Talk, Virtual Faces: A Formal Concept Analysis of Personality and Sentiment in Influencer Audiences

Shahram Chaudhry, Sidahmed Benabderrahmane, Talal Rahwan

Abstract

Virtual influencers (VIs), digitally synthetic social-media personas, attract audiences whose discourse appears qualitatively different from the discourse around human influencers (HIs). Existing work characterises this difference through surveys or aggregate engagement statistics, which reveal what audiences say but not how multiple signals co-occur. We propose a two-layer, structure-first framework grounded in Formal Concept Analysis (FCA) and association rule mining. The first layer applies FCA with support-based iceberg filtering to weekly-aggregated comment data, extracting discourse profiles: weekly co-occurrence bundles of sentiment, Big Five personality cues, and topic tags. The second layer mines association rules at the comment level, revealing personality-sentiment-topic dependencies invisible to frequency-table analysis. Applied to YouTube comments from three VI-HI influencer pairs, the two-layer analysis reveals a consistent structural divergence: HI discourse concentrates into a single, emotionally regulated, stability-centred regime in which low neuroticism anchors positivity, while VI discourse supports three structurally distinct discourse modes, including an appearance-discourse cluster absent from HI despite near-equal marginal prevalence. Topic-specific analyses further show that VI contexts exhibit more negative sentiment than HI contexts in psychologically sensitive domains (mental health, body image, artificial identity). Our results position FCA as a principled tool for multi-signal discourse analysis and demonstrate that virtuality reshapes not just what audiences say, but the underlying grammar of how signals co-occur in their reactions.
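To make the two layers concrete, the following is a minimal sketch, not the authors' implementation. It assumes a binary formal context whose objects are weeks and whose attributes are tags, enumerates closed attribute sets by brute force (practical only for tiny contexts), and keeps those whose relative support meets a threshold, which is the usual meaning of an iceberg filter. The second function computes support and confidence for a single comment-level rule. All data are toy values invented for illustration; the function names `iceberg_concepts` and `rule_stats`, the threshold `0.5`, and the tags `topic_appearance` and `sentiment_Negative` are hypothetical.

```python
from itertools import combinations

# Toy formal context (hypothetical): objects are weeks, attributes are binary tags.
context = {
    "week1": {"sentiment_Positive", "Neuroticism_low", "topic_positivity"},
    "week2": {"sentiment_Positive", "Neuroticism_low"},
    "week3": {"sentiment_Positive", "Openness_high", "topic_appearance"},
    "week4": {"Openness_high", "topic_appearance"},
}

# Toy comment-level transactions (hypothetical) for the rule-mining layer.
comments = [
    {"Neuroticism_low", "sentiment_Positive"},
    {"Neuroticism_low", "sentiment_Positive"},
    {"Neuroticism_low", "sentiment_Negative"},
    {"Openness_high", "sentiment_Positive"},
]


def extent(attrs, ctx):
    """Objects that carry every attribute in attrs."""
    return {obj for obj, tags in ctx.items() if attrs <= tags}


def intent(objs, ctx):
    """Attributes shared by every object in objs."""
    if not objs:
        return set().union(*ctx.values())
    return set.intersection(*(ctx[obj] for obj in objs))


def iceberg_concepts(ctx, min_support):
    """Map each closed intent with relative support >= min_support to its extent.

    Brute force over all attribute subsets: exponential, for illustration only.
    """
    attrs = sorted(set().union(*ctx.values()))
    n_objects = len(ctx)
    found = {}
    for size in range(len(attrs) + 1):
        for combo in combinations(attrs, size):
            ext = extent(set(combo), ctx)
            if len(ext) / n_objects < min_support:
                continue  # iceberg filter: drop infrequent attribute sets
            closed = frozenset(intent(ext, ctx))  # closure of the attribute set
            found[closed] = frozenset(ext)
    return found


def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    n_total = len(transactions)
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n_total
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


for closed_intent, weeks in iceberg_concepts(context, min_support=0.5).items():
    print(sorted(closed_intent), "supported by", sorted(weeks))

print(rule_stats(comments, {"Neuroticism_low"}, {"sentiment_Positive"}))
```

In this toy context the closure step is what separates a discourse profile from a plain frequent itemset: `Neuroticism_low` never appears without `sentiment_Positive`, so the two are reported as one bundle rather than as separate frequent tags.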

Paper Structure

This paper contains 46 sections, 1 equation, 11 figures, 9 tables.

Figures (11)

  • Figure 1: Schematic concept lattice comparison. Left (HI): discourse converges to a single stability-centred chain (topic_positivity$\to$Neuroticism_low$\to$sentiment_Positive); 8 filtered rules, 1 discourse mode. Right (VI): discourse fans into three branches from an Openness_high backbone; 51 filtered rules, 3 discourse modes. The dashed orange border marks the appearance-discourse cluster absent from HI despite near-equal appearance prevalence (VI: 0.170 vs. HI: 0.159), the core "when frequencies lie" result.
  • Figure 2: HI-only stable concepts
  • Figure 3: VI-only stable concepts
  • Figure 5: Sentiment distribution: artificial identity. VI shows $\approx$25% negative vs. $\approx$12% for HI.
  • Figure 6: Big Five profile: artificial identity. VI commenters score lower on Agreeableness and Conscientiousness.
  • ...and 6 more figures