Whose ChatGPT? Unveiling Real-World Educational Inequalities Introduced by Large Language Models

Renzhe Yu; Zhen Xu; Sky CH-Wang; Richard Arum

Abstract

The universal availability of ChatGPT and similar tools since late 2022 has prompted tremendous public excitement and experimentation around the potential of large language models (LLMs) to improve learning experiences and outcomes, especially for learners from disadvantaged backgrounds. However, little research has systematically examined the real-world impacts of LLM availability on educational equity beyond theoretical projections and controlled studies of innovative LLM applications. To depict trends in post-LLM inequalities, we analyze 1,140,328 academic writing submissions from 16,791 college students across 2,391 courses between 2021 and 2024 at a public, minority-serving institution in the US. We find that students' overall writing quality gradually increased following the availability of LLMs and that the writing quality gaps between linguistically advantaged and disadvantaged students progressively narrowed. However, this equitizing effect was more concentrated among students with higher socioeconomic status. These findings shed light on the digital divides in the era of LLMs, raise questions about the equity benefits of LLMs in their early stages, and highlight the need for researchers and practitioners to develop responsible practices that improve educational equity through LLMs.

Paper Structure

This paper contains 10 figures, 6 tables.

Figures (10)

Figure S1: Estimated changes in academic writing proficiency among linguistically advantaged student groups, by socioeconomic status (SES). Each bar shows the predicted average change (in standard deviation) in a composite writing proficiency index between a post-LLM phase and the pre-LLM period for a given student subgroup. Phase 1: January 2023 to June 2023; Phase 2: October 2023 to March 2024. Error bars indicate 90% confidence intervals. Statistical significance for the difference between adjacent bars (three-way interaction term between indicators of the linguistic group, socioeconomic group and phase in the regression model) is denoted by asterisks: $p<0.10(\cdot)$, $p<0.05(^*)$, $p<0.01(^{**})$, $p<0.001(^{***})$.
Figure S2: Estimated changes in academic writing proficiency, for writing submissions with greater grading variability. The sample only includes writing submissions to assignments where the grading was not limited to a binary scale such as 0 or full credit. Each bar represents the estimate of the average change (in standard deviation) in a composite writing proficiency index between a post-LLM phase and the pre-LLM period in the data. Phase 1: January 2023 to June 2023; Phase 2: October 2023 to March 2024. Error bars indicate 90% confidence intervals.
Figure S3: Estimated changes in academic writing proficiency, by linguistic background, for writing submissions with greater grading variability. The sample only includes writing submissions to assignments where the grading was not limited to a binary scale such as 0 or full credit. Each bar shows the predicted average change (in standard deviation) in a composite writing proficiency index between a post-LLM phase and the pre-LLM period for a given student group. Phase 1: January 2023 to June 2023; Phase 2: October 2023 to March 2024. Error bars indicate 90% confidence intervals. Statistical significance for the difference between adjacent bars (interaction term between the group and the phase indicators in the regression model) is denoted by asterisks: $p<0.10(\cdot)$, $p<0.05(^*)$, $p<0.01(^{**})$, $p<0.001(^{***})$.
Figure S4: Estimated changes in academic writing proficiency among linguistically disadvantaged student groups, by socioeconomic status (SES), for writing submissions with greater grading variability. The sample only includes writing submissions to assignments where the grading was not limited to a binary scale such as 0 or full credit. Each bar shows the predicted average change (in standard deviation) in a composite writing proficiency index between a post-LLM phase and the pre-LLM period for a given student subgroup. Phase 1: January 2023 to June 2023; Phase 2: October 2023 to March 2024. Error bars indicate 90% confidence intervals. Statistical significance for the difference between adjacent bars (three-way interaction term between indicators of the linguistic group, socioeconomic group and phase in the regression model) is denoted by asterisks: $p<0.10(\cdot)$, $p<0.05(^*)$, $p<0.01(^{**})$, $p<0.001(^{***})$.
Figure S5: Estimated changes in academic writing proficiency, for writing submissions with longer content. The sample only includes writing submissions to assignments where the median submission length was over 100 words. Each bar represents the estimate of the average change (in standard deviation) in a composite writing proficiency index between a post-LLM phase and the pre-LLM period in the data. Phase 1: January 2023 to June 2023; Phase 2: October 2023 to March 2024. Error bars indicate 90% confidence intervals.
...and 5 more figures