Topic-wise Exploration of the Telegram Group-verse

Alessandro Perlo, Giordano Paoletti, Nikhil Jha, Luca Vassio, Jussara Almeida, Marco Mellia

TL;DR

The paper examines how user interaction patterns in Telegram public groups differ across topics. It introduces an open-source two-stage crawler that uses TGStat to discover groups and Telethon to collect messages, yielding a longitudinal dataset of 51.6 million messages from 669 active groups over two months. A per-topic characterisation reveals heterogeneity in bot usage, language distribution, message length, and non-textual elements, and uncovers patterns of video sharing, external linking, and repeated content that indicate spamming dynamics and coordinated behaviour. The study provides a foundation for cross-topic analyses of content dynamics in Telegram and offers data and tools to enable future research, with ethical safeguards to minimise impact on platforms and users.

Abstract

Although Telegram is currently one of the most popular instant messaging apps in the world, previous studies have mainly focused on analysing discussions from specific angles and on specific topics. In this paper, we present a broad analysis of publicly accessible groups that cover a wide range of discussions, including Education, Erotic, Politics, and Cryptocurrencies. How do people interact in groups on different topics? Is there any common or peculiar behaviour? We engineer and offer an open-source tool to automate the collection of messages from Telegram groups, a non-straightforward problem. We use it to collect more than 51 million messages from 669 groups. Here, we present a first-of-its-kind, per-topic analysis, contrasting the users' activity patterns from different angles -- the language, the presence of bots, the type and volume of shared media content, links to external platforms, etc. Our results confirm some anecdotal evidence, e.g., indications of spamming behaviour, and unveil some unexpected findings, e.g., the different sharing patterns of video and message length in groups of different topics. Our research provides a horizontal analysis of public groups in Telegram across various general topics, establishing a foundation for future studies that can delve deeper into user interactions and content dynamics within this unique messaging environment.

Paper Structure

This paper contains 24 sections, 11 figures, 4 tables.

Figures (11)

  • Figure 1: Fraction of messages sent by bots in a group across topics.
  • Figure 2: Most popular language in each group.
  • Figure 3: ECCDF of the number of characters in text messages grouped by topic (groups in English).
  • Figure 4: Median fraction of messages with non-textual elements in selected topics.
  • Figure 5: Distribution of video duration (left, 10-minute bins) and comparison between video size and video duration (right, with regression line).
  • ...and 6 more figures
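
Figure 3 reports an ECCDF (empirical complementary cumulative distribution function) of message lengths. As a minimal sketch of how such a curve can be computed, assuming the message lengths of one topic are available as a plain list of integers, the function below returns, for each distinct length x, the fraction of messages strictly longer than x. The function name `eccdf` and the sample data are hypothetical and not taken from the authors' tool.

```python
from collections import Counter


def eccdf(values):
    """Return (xs, ys): sorted distinct values and, for each x,
    the fraction of samples strictly greater than x."""
    n = len(values)
    if n == 0:
        return [], []
    counts = Counter(values)
    xs = sorted(counts)
    remaining = n  # samples not yet accounted for
    ys = []
    for x in xs:
        remaining -= counts[x]  # drop samples equal to x
        ys.append(remaining / n)
    return xs, ys


# Hypothetical message lengths (characters) for one topic.
lengths = [1, 2, 2, 4]
xs, ys = eccdf(lengths)
print(list(zip(xs, ys)))
```

Plotting `ys` against `xs` on logarithmic axes, one curve per topic, would give a view comparable in spirit to Figure 3; the exact binning and filtering used in the paper are not specified here.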