Analyzing Gender Polarity in Short Social Media Texts with BERT: The Role of Emojis and Emoticons

Saba Yousefian Jazi; Amir Mirzaeinia; Sina Yousefian Jazi

TL;DR

This work tackles the problem of inferring an author’s gender from short Twitter texts by augmenting a BERT-based classifier with non-word cues such as emojis, emoticons, and user mentions. It details a fine-tuning setup for a BERT-base-uncased model with a sigmoid head, dropout of $0.1\%$, and a learning rate of $2 \times 10^{-5}$ for $10$ epochs on a Tesla T4, using a 50/50 gender-balanced dataset of celebrity tweets in which emojis have been replaced with text. Experiments show that mentions and emoji substitutions significantly influence gender-polarity predictions and that sentiment signals partially explain some misclassifications, though data biases and incomplete emoji coverage limit the conclusions. The findings emphasize stylometric cues in short-text gender profiling and suggest improving robustness through broader, more diverse data and more comprehensive emoji handling.
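
As a rough illustration of what a "sigmoid head" on BERT-base-uncased usually means, the equations below assume the classifier reads the pooled `[CLS]` representation $h \in \mathbb{R}^{768}$ and is trained with binary cross-entropy. Neither assumption is confirmed by this summary, and the symbols $w$, $b$, $\hat{y}$, and $y$ are introduced here only for illustration.

```latex
% Assumed form of the classification head (not confirmed by the paper summary)
\hat{y} = \sigma\!\left(w^{\top}\,\mathrm{Dropout}(h) + b\right),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}

% Assumed training objective: binary cross-entropy for a label y \in \{0, 1\}
\mathcal{L}(y, \hat{y}) = -\bigl[\, y \log \hat{y} + (1 - y)\log(1 - \hat{y}) \,\bigr]
```

Under this reading, $\hat{y}$ is a score in $(0, 1)$ that can be interpreted as the gender polarity of a tweet, and thresholding it (for example at $0.5$) yields a binary prediction.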

Abstract

In this work, we fine-tuned several BERT-based models to detect the gender polarity of Twitter accounts. We focused specifically on how emojis and emoticons affect the model's performance on the classification task. We show that these non-word inputs, together with mentions of other accounts, affect the detection of the account holder's gender in a short-text format such as a tweet.
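
The preprocessing implied above (replacing emojis and emoticons with text, and optionally removing mentions) might look like the following minimal sketch. It is not the authors' pipeline: the `EMOTICONS` table, the `preprocess` function, and the use of Unicode character names as the replacement text are all assumptions made for illustration, and the emoji coverage is deliberately incomplete.

```python
import re
import unicodedata

# Hypothetical emoticon lookup; the paper's actual mapping is not given here.
EMOTICONS = {
    ":-)": "smile",
    ":)": "smile",
    ":(": "sad",
    ":D": "grin",
    ";)": "wink",
    "<3": "heart",
}

# Twitter-style mention: "@" followed by word characters.
MENTION_RE = re.compile(r"@\w+")


def emoji_to_text(ch):
    """Return the lowercase Unicode name for symbol characters, else ch unchanged."""
    # Category "So" (Symbol, other) covers most single-codepoint emojis.
    # Modifiers and joiners (skin tones, ZWJ sequences) are not handled.
    if unicodedata.category(ch) == "So":
        name = unicodedata.name(ch, "")
        if name:
            return " " + name.lower() + " "
    return ch


def preprocess(tweet, keep_mentions=True):
    """Replace emoticons and emojis with words; optionally drop @mentions."""
    if not keep_mentions:
        tweet = MENTION_RE.sub(" ", tweet)
    # Replace longer emoticons first so ":-)" is handled before ":)".
    for emoticon in sorted(EMOTICONS, key=len, reverse=True):
        tweet = tweet.replace(emoticon, " " + EMOTICONS[emoticon] + " ")
    tweet = "".join(emoji_to_text(ch) for ch in tweet)
    # Collapse the extra whitespace introduced by the substitutions.
    return " ".join(tweet.split())


example = "Love this \U0001F602 @friend :)"
with_mentions = preprocess(example, keep_mentions=True)
without_mentions = preprocess(example, keep_mentions=False)
```

Calling `preprocess` once with `keep_mentions=True` and once with `keep_mentions=False` gives the two input variants that the mention experiment compares.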

Paper Structure (9 sections, 2 figures, 3 tables)

Introduction
Background
Pretrained Model
Experiments
Data
Experiment with mentions
Sentiment analyses
Discussion of results
Conclusion

Figures (2)

Figure 1: Confusion matrix of the original experiment: emojis replaced with text, mentions included.
Figure 2: Confusion matrix of the second experiment: emojis replaced with text, mentions removed.
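
For readers unfamiliar with how such confusion matrices are obtained from a sigmoid classifier, here is a small sketch. The labels and scores are made-up toy values, not results from the paper, and the 0/1 label encoding, the 0.5 threshold, and the function names are assumptions.

```python
def confusion_matrix(y_true, y_prob, threshold=0.5):
    """Return 2x2 counts: rows are true labels, columns are predicted labels."""
    matrix = [[0, 0], [0, 0]]
    for true, prob in zip(y_true, y_prob):
        # Turn the sigmoid score into a hard 0/1 prediction.
        pred = 1 if prob >= threshold else 0
        matrix[true][pred] += 1
    return matrix


def accuracy(matrix):
    """Fraction of examples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return (matrix[0][0] + matrix[1][1]) / total if total else 0.0


# Toy data for illustration only (not taken from the paper).
toy_labels = [0, 0, 1, 1, 1]
toy_scores = [0.2, 0.7, 0.9, 0.4, 0.6]
toy_matrix = confusion_matrix(toy_labels, toy_scores)
toy_accuracy = accuracy(toy_matrix)
```

Running the same computation on predictions made with and without mentions would produce one matrix per experiment, which is the comparison the two figures present.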
