Characterizing Information Shared by Participants to Coding Challenges: The Case of Advent of Code
Francesco Cauteruccio, Enrico Corradini, Luca Virgili
TL;DR
The paper addresses how information shared by Advent of Code (AoC) participants on Reddit evolves, focusing on participation, language adoption, and resiliency across the 2019-2021 editions. It builds a dataset from the /r/adventofcode megathreads, identifies programming languages via a four-step pipeline, and represents interactions with an extended stream graph $S=(T,V,W,E,L)$ in which edges carry language labels. Key findings include that the top languages (e.g., Python and Rust) are stable across years, that participants tend to use a single language within an edition while frequently switching between editions, and that languages categorized as Loved or Popular in a Stack Overflow survey show longer maximal usage and attract switches. The work offers a data-driven lens for educational gamification and programming competitions and demonstrates a replicable methodology for analyzing online learning communities and coding challenge discussions.
Abstract
Advent of Code (AoC from now on) is a popular coding challenge in which participants solve programming puzzles suited to a variety of skill sets and levels. AoC follows the Advent calendar: it is an annual challenge that lasts 25 days. AoC participants usually post their solutions on social networks and discuss them online. These challenges are interesting to study since they could highlight the adoption of new tools, the evolution of the developer community, or the technological requirements of well-known companies. For these reasons, we first create a dataset of the 2019-2021 AoC editions containing the discussion threads posted on the subreddit /r/adventofcode. Then, we propose a model based on stream graphs to study this context, in which we represent its most important actors through time: participants, comments, and programming languages. Using our model, we investigate user participation, the adoption of new programming languages during a challenge and between two challenges, and the resiliency of programming languages based on a Stack Overflow survey. We find that the most-used programming languages are almost the same across the three years, which points to their importance. Moreover, participants tend to keep the same programming language for the whole challenge, while those who take part in two AoC editions usually change it in the next one. Finally, we observe interesting results about the programming languages that are "Popular" or "Loved" according to the Stack Overflow survey. First, these are the languages used for the longest time within an AoC edition, which gives their users a high chance of reaching the end of the challenge. Second, they are the languages most often chosen when a participant decides to change programming language during the same challenge.
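To make the extended stream graph $S=(T,V,W,E,L)$ more concrete, the following is a minimal Python sketch, not the authors' implementation. It rests on several assumptions that this section does not confirm: nodes are participants (plus a placeholder `"megathread"` node), time is the day of the challenge, each edge is one comment, and the five sets are simply derived from the recorded links. The names `LabeledLink`, `LabeledStreamGraph`, and `single_language_share` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LabeledLink:
    """One temporal interaction (hypothetical layout, not the paper's schema)."""
    time: int       # assumed: day of the challenge
    source: str     # assumed: participant who writes the comment
    target: str     # assumed: participant or thread being answered
    language: str   # language label carried by the edge


@dataclass
class LabeledStreamGraph:
    """Toy container whose sets T, V, W, E, L are derived from the links."""
    links: list = field(default_factory=list)

    def add_link(self, time, source, target, language):
        self.links.append(LabeledLink(time, source, target, language))

    @property
    def T(self):
        # Simplification: only the time instants at which a link occurs.
        return {link.time for link in self.links}

    @property
    def V(self):
        return {n for link in self.links for n in (link.source, link.target)}

    @property
    def W(self):
        # (time, node) pairs: a node counts as present when it takes part in a link.
        return {(link.time, n) for link in self.links
                for n in (link.source, link.target)}

    @property
    def E(self):
        return {(link.time, link.source, link.target) for link in self.links}

    @property
    def L(self):
        return {link.language for link in self.links}

    def languages_by_author(self):
        """Map each comment author to the set of languages on their edges."""
        used = defaultdict(set)
        for link in self.links:
            used[link.source].add(link.language)
        return dict(used)

    def single_language_share(self):
        """Fraction of authors whose edges all carry the same language label."""
        used = self.languages_by_author()
        if not used:
            return 0.0
        return sum(len(langs) == 1 for langs in used.values()) / len(used)
```

As a usage illustration with made-up data, adding two Python links for `"alice"` and one Rust plus one Python link for `"bob"` gives a `single_language_share()` of 0.5, since only one of the two authors stays with a single language. A statistic of this kind is one way to quantify the observation that participants tend to keep the same language within an edition.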
