FAIR: Framing AI's Role in Programming Competitions -- Understanding How LLMs Are Changing the Game in Competitive Programming

Dongyijie Primo Pan, Lan Luo, Ji Zhu, Zhiqi Gao, Xin Tong, Pan Hui

TL;DR

The paper investigates how large language models (LLMs) are reshaping competitive programming by analyzing stakeholder workflows, fairness norms, and governance. It draws on 37 in-depth interviews, a global survey of 207 contestants, and Codeforces contest logs (2022–2025) to triangulate changes in practice and integrity enforcement. A key contribution is a chess-inspired governance framework that combines anomaly detection, expert review, and grassroots community oversight to preserve fairness, transparency, and credibility amid AI-enabled misuse. The findings show that LLMs accelerate post-contest learning and tooling but create gray zones and incentives for misuse in high-stakes settings, which calls for layered, auditable policies and community participation. Practically, the work guides platform operators and educators toward balanced rules, disclosure norms, and multi-actor oversight to sustain the educational value of competitive programming in the AI era.

Abstract

This paper investigates how large language models (LLMs) are reshaping competitive programming. The field functions as an intellectual contest within computer science education and is marked by rapid iteration, real-time feedback, transparent solutions, and strict integrity norms. Prior work has evaluated LLMs' performance on contest problems, but little is known about how human stakeholders -- contestants, problem setters, coaches, and platform stewards -- are adapting their workflows and contest norms under LLM-induced shifts. At the same time, rising AI-assisted misuse and inconsistent governance expose urgent gaps in sustaining fairness and credibility. Drawing on 37 interviews spanning all four roles, a global survey of 207 contestants, and an API-based crawl of Codeforces contest logs (2022-2025) for quantitative analysis, we contribute: (i) an empirical account of evolving workflows, (ii) an analysis of contested fairness norms, and (iii) a chess-inspired governance approach with actionable measures -- real-time LLM checks in online contests, peer co-monitoring and reporting, and cross-validation against offline performance -- to curb LLM-assisted misuse while preserving fairness, transparency, and credibility.

Paper Structure

This paper contains 56 sections, 4 equations, 5 figures, 9 tables.

Figures (5)

  • Figure 1: Geographical coverage of survey respondents. Countries and regions with respondents are shown in blue.
  • Figure 2: Perceptions of LLM usage across rating groups. Bars show means with SEM. Pairwise Mann–Whitney U tests (Bonferroni corrected) indicate that novices perceive significantly greater efficiency and performance gains than Masters+ ($p < .05$), whereas concerns about over-reliance show no significant differences.
  • Figure 3: Workflow change of problem setters. While the ideation and screening stage remains fully human-driven, subsequent steps of the pipeline (drafting, proofing, solution writing, validation, translation, and polishing) increasingly integrate LLM assistance for low-creativity and repetitive tasks. Importantly, setters emphasized that AI should never intrude into the creative ideation phase, because doing so would undermine the originality and aesthetic value of competitive programming problem design.
  • Figure 4: Contestants’ opinions on current policies and gray zones. Stacked bars show Likert distributions (1=Strongly disagree, 5=Strongly agree) for five statements: banning all AI use, trusting current rules, allowing only minor help, adopting tiered allowances, and requiring disclosure. Results show broad support for banning AI and mandatory disclosure, general trust in existing rules, but sharp polarization on minor help and tiered allowance.
  • Figure 5: Radar chart of governance-related survey items. Dimensions include compliance with organizer rules, mandatory disclosure, policy awareness, transparency, voice in rule-making, trust in enforcement, and satisfaction with handling. High compliance and support for disclosure contrast with weaker trust and satisfaction, revealing a clear compliance–trust gap.
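
The Figure 2 caption refers to pairwise Mann–Whitney U tests with Bonferroni correction across rating groups. The following is a minimal sketch of that kind of analysis, not the authors' code. It assumes a two-sided test with a normal approximation and tie correction (the caption does not say which variant was used), and the function names, group labels, and Likert scores are all invented for illustration.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, approximate p-value). No continuity
    correction is applied; this is an assumption of the sketch.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2

    # Tie correction: sum of (t^3 - t) over each group of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every observation identical: no evidence of a shift
        return u1, 1.0

    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u1, p


def pairwise_bonferroni(groups):
    """Run every pairwise test and multiply each p-value by the number of
    comparisons, capping the adjusted value at 1."""
    pairs = list(combinations(groups, 2))
    results = {}
    for a, b in pairs:
        u, p = mann_whitney_u(groups[a], groups[b])
        results[(a, b)] = (u, min(1.0, p * len(pairs)))
    return results


if __name__ == "__main__":
    # Hypothetical 1-5 Likert responses; NOT data from the paper.
    likert = {
        "Novice": [5, 4, 5, 4, 4, 5, 3, 5],
        "Expert": [4, 3, 4, 3, 4, 2, 3, 4],
        "Masters+": [2, 3, 2, 1, 3, 2, 2, 3],
    }
    for (a, b), (u, p_adj) in pairwise_bonferroni(likert).items():
        print(f"{a} vs {b}: U={u:.1f}, Bonferroni-adjusted p={p_adj:.4f}")
```

With three rating groups there are three pairwise comparisons, so each raw p-value is multiplied by 3 before being compared against the .05 threshold. For small samples an exact test would be more appropriate than the normal approximation used here.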