How Language, Culture, and Geography shape Online Dialogue: Insights from Koo

Amin Mekacher, Max Falkenberg, Andrea Baronchelli

TL;DR

For language groups of similar size, Indian languages fostered higher discourse diversity than non-Indian languages, possibly reflecting synergistic effects that boosted the uptake and retention of these groups.

Abstract

Koo is a microblogging platform based in India launched in 2020 with the explicit aim of catering to non-Western communities in their vernacular languages. With a near-complete dataset totalling over 71M posts and 399M user interactions, we show how Koo has attracted users from several countries including India, Nigeria and Brazil, but with variable levels of sustained user engagement. We highlight how Koo's interaction network has been shaped by multiple country-specific migrations and displays strong divides between linguistic and cultural communities, for instance, with English-speaking communities from India and Nigeria largely isolated from one another. Finally, we analyse the content shared by each linguistic community and identify cultural patterns that promote similar discourses across language groups. Our study raises the prospect that a multilingual and politically diverse platform like Koo may be able to cultivate vernacular communities that have, historically, not been prioritised by US-based social media platforms.

Paper Structure (7 sections, 7 equations, 6 figures)

Figures (6)

  • Figure 1: Daily number of registrations on Koo, and the impact of collective migration. 7-day moving average of the daily number of registrations on Koo, from the beginning of 2020 to early 2023. The dashed lines indicate, in order: the migration of BJP politicians and their supporters following the Indian Farmers' Protest in February 2021; the migration of the Nigerian government after Twitter was banned in the country in June 2021; the Brazilian community joining Koo in November 2022 after Elon Musk purchased Twitter.
  • Figure 2: Heterogeneous user retention for various linguistic communities. Kaplan-Meier survival curves for the main linguistic communities on Koo, showing the fraction of users who remained active after a given number of days. For each user, we define "day zero" as their registration date on Koo. Other linguistic communities are displayed in grey. Each retention curve is displayed until the day on which fewer than 1% of users from that linguistic community remain active.
  • Figure 3: The Koo interaction network and the impact of linguistic homophily on the network's structure. Each node represents a user, and two nodes are connected if one of the users interacted with the other user's content. Users are coloured according to their modal language on the platform. The main linguistic communities are the Hindi-speaking users (blue), English-speaking users (green), Nigerian users (purple) and Portuguese-speaking users (yellow). The layout is generated using a force-directed graph drawing method. A) The total interaction network. B) The $k$-core of the interaction network with $k = 150$. C) The Shannon entropy of the modal language of the nodes belonging to the $k$-core of the graph, as a function of $k$. The entropy of the interaction network is compared to the value obtained in a null model, where we shuffle the modal language associated with each node in the network.
  • Figure 4: EI homophily, language commitment, and the impact of a community's size on its sustainability. Number of users belonging to a linguistic community plotted against A) their commitment to their modal language, and B) their EI homophily index. Both metrics are averaged over the users whose modal language is the language being measured. The coloured dots represent the Hindi-speaking community (blue), English (green), Portuguese (yellow) and Nigerian English (purple). The dashed line indicates an average homophily equal to 0.
  • Figure 5: Global language network and multilingual activity. A) The correlation measured from the global language network. Two languages with a positive correlation share more connections than expected based on their respective numbers of speakers; the correlation is negative otherwise. B) The t-statistic for each pair of languages in the global language network. Blue cells indicate that the link between the two languages is significant with respect to the t-statistic, whereas red cells highlight non-significant links. A link is considered significant if $p < 0.05$.
  • ...and 1 more figure
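
The comparison described in the caption of Figure 3C (entropy of modal languages inside the $k$-core, set against a label-shuffling null model) can be illustrated with a small sketch. This is not the authors' code: the toy network, the function names (`build_adjacency`, `k_core`, `shannon_entropy`, `null_entropy`), the use of base-2 logarithms, and the number of shuffles are all assumptions made for illustration. The sketch treats the interaction network as undirected, following the caption's definition of an edge.

```python
import math
import random
from collections import Counter


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from edge pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def k_core(adjacency, k):
    """Return the nodes of the k-core by repeatedly pruning nodes with degree < k."""
    alive = set(adjacency)
    degree = {node: len(adjacency[node]) for node in alive}
    queue = [node for node in alive if degree[node] < k]
    while queue:
        node = queue.pop()
        if node not in alive:
            continue  # already pruned via an earlier queue entry
        alive.remove(node)
        for neighbour in adjacency[node]:
            if neighbour in alive:
                degree[neighbour] -= 1
                if degree[neighbour] < k:
                    queue.append(neighbour)
    return alive


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of the labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def null_entropy(adjacency, language, k, n_shuffles=100, seed=0):
    """Mean k-core entropy after shuffling language labels across all nodes."""
    rng = random.Random(seed)
    core = k_core(adjacency, k)
    nodes = list(language)
    labels = [language[node] for node in nodes]
    values = []
    for _ in range(n_shuffles):
        rng.shuffle(labels)
        shuffled = dict(zip(nodes, labels))
        values.append(shannon_entropy(shuffled[node] for node in core))
    return sum(values) / len(values)


# Hypothetical toy network: a dense Hindi clique with a sparser multilingual periphery.
hindi = ["h0", "h1", "h2", "h3", "h4"]
edges = [(a, b) for i, a in enumerate(hindi) for b in hindi[i + 1:]]
edges += [("e0", "e1"), ("e1", "e2"), ("e0", "e2"), ("e0", "h0")]  # English triangle
edges += [("p0", "p1"), ("p0", "h1")]                              # Portuguese chain
adjacency = build_adjacency(edges)
language = {n: "hi" for n in hindi}
language.update({"e0": "en", "e1": "en", "e2": "en", "p0": "pt", "p1": "pt"})

core_nodes = k_core(adjacency, 4)
observed = shannon_entropy(language[n] for n in core_nodes)
expected_under_null = null_entropy(adjacency, language, 4)
print(sorted(core_nodes), observed, expected_under_null)
```

In this toy case the 4-core contains only the Hindi clique, so its observed entropy is zero while the shuffled labels give a positive average. An observed entropy below the null value is the signature of linguistic homophily in the densest part of the network.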