The Koo Dataset: An Indian Microblogging Platform With Global Ambitions

Amin Mekacher, Max Falkenberg, Andrea Baronchelli

TL;DR

This paper presents the largest publicly available Koo dataset, spanning from the platform's founding in early 2020 to September 2023, with detailed metadata for 72M posts, 75M comments, 40M shares, 284M likes and 1.4M user profiles.

Abstract

Increasingly, alternative platforms are playing a key role in the social media ecosystem. Koo, a microblogging platform based in India, has emerged as a major new social network hosting high-profile politicians from several countries (India, Brazil, Nigeria) and many internationally renowned celebrities. This paper presents the largest publicly available Koo dataset, spanning from the platform's founding in early 2020 to September 2023, providing detailed metadata for 72M posts, 75M comments, 40M shares, 284M likes and 1.4M user profiles. Along with the release of the dataset, we provide an overview of the platform, including a discussion of its news ecosystem, hashtag usage and user engagement. Our results highlight the pivotal role that new platforms play in shaping online communities in emerging economies and the Global South, connecting local politicians and public figures with their followers. With Koo's ambition to become the town hall for diverse non-English-speaking communities, our dataset offers new opportunities for studying social media beyond a Western context.

Paper Structure (7 sections, 5 figures, 4 tables)

Figures (5)

  • Figure 1: An example of a koo. The main panel includes the original post in Hindi, its English translation and an image. The top panel provides information about the poster, including their user handle, profile picture, self-declared title, the yellow tick of eminence (if applicable), and the post creation date. The bottom panel allows logged-in users to comment on, share or like the post. Koo provides additional icons to share posts on other platforms.
  • Figure 2: Co-occurrence network of accounts of eminence. Two eminent users are connected by an edge if at least 50 accounts on Koo interact with both of them. Nodes are coloured according to modal account language. Node shapes differentiate Indian and non-Indian languages.
  • Figure 3: Daily activity and number of active users. A) 7-day moving average of the amount of content (posts, comments, likes and shares) posted on Koo. B) 7-day moving average of the number of active users on a given day. A user is considered active if they created a new post or if they commented on, shared or liked an existing post. The dashed lines indicate the events that led to the major collective migrations to Koo, namely 1) the Farmers' Protest in India; 2) Twitter getting banned in Nigeria and 3) Elon Musk's purchase of Twitter and the subsequent Brazilian migration.
  • Figure 4: Top-shared web domains and their prevalence in the dominant linguistic communities. Number of links leading to a web domain shared by the top-10 linguistic communities on Koo. The top-20 shared domains are shown.
  • Figure 5: Media plurality across linguistic communities on Koo. The Gini coefficient for the news media web domains shared by each linguistic community, plotted against the population size of each linguistic community. A Gini coefficient close to 1 highlights a monopoly held by a single news source, whereas a Gini coefficient close to 0 indicates a more diverse link-sharing ecosystem.
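
Two of the quantities described in the captions above can be illustrated with a short sketch: the edge rule of Figure 2 (two eminent accounts are linked when enough accounts interact with both) and the Gini coefficient of Figure 5. This is not the authors' code. The function names, the `(user, eminent_account)` input format and the particular Gini estimator are assumptions made for illustration only; the source gives just the threshold of 50 shared accounts and the interpretation of the coefficient.

```python
from collections import defaultdict
from itertools import combinations


def gini(counts):
    """Gini coefficient of non-negative counts (hypothetical helper).

    Returns 0 when every domain is shared equally often. With n domains,
    the maximum of this estimator is (n - 1) / n, reached when a single
    domain receives every link, so it approaches 1 only for large n.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending values (ranks start at 1).
    weighted = sum(rank * v for rank, v in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def cooccurrence_edges(interactions, min_shared=50):
    """Link two eminent accounts if at least `min_shared` users interact with both.

    `interactions` is an iterable of (user, eminent_account) pairs; this
    input layout is an assumption, not the dataset's actual schema.
    Returns a dict mapping (account_a, account_b) to the shared-user count.
    """
    audiences = defaultdict(set)
    for user, eminent in interactions:
        audiences[eminent].add(user)
    edges = {}
    for a, b in combinations(sorted(audiences), 2):
        shared = len(audiences[a] & audiences[b])
        if shared >= min_shared:
            edges[(a, b)] = shared
    return edges
```

For example, `gini([5, 5, 5, 5])` is 0 because all four domains are shared equally often, while `gini([0, 0, 0, 10])` is 0.75, the maximum for four domains. In `cooccurrence_edges`, the default `min_shared=50` mirrors the threshold stated in the Figure 2 caption; a smaller value is convenient when experimenting on toy data.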