What is Best for Students, Numerical Scores or Letter Grades?

Evi Micha; Shreyas Sekar; Nisarg Shah

Abstract

We study letter grading schemes, which are routinely employed for evaluating student performance. Typically, a numerical score obtained via one or more evaluations is converted into a letter grade (e.g., A+, B-, etc.) by associating a disjoint interval of numerical scores to each letter grade. We propose the first model for studying the (de)motivational effects of such grading on the students and, consequently, on their performance in future evaluations. We use the model to compare uniform letter grading schemes, in which the range of scores is divided into equal-length parts that are mapped to the letter grades, to numerical scoring, in which the score is not converted to any letter grade (equivalently, every score is its own letter grade). Theoretically, we identify realistic conditions under which numerical scoring is better than any uniform letter grading scheme. Our experiments confirm that this holds under even weaker conditions, but also find cases where the converse occurs.
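The uniform letter grading scheme described in the abstract can be illustrated with a minimal sketch. It assumes scores lie in the range [0, 100] and uses a hypothetical helper `uniform_letter_grade`; neither the range nor the function name comes from the paper, and the paper's motivation model is not reproduced here.

```python
def uniform_letter_grade(score, num_grades, lo=0.0, hi=100.0):
    """Return the 0-based index of the equal-length bucket containing `score`.

    Assumption (not from the paper): scores lie in [lo, hi] = [0, 100].
    Bucket 0 is the lowest grade and bucket num_grades - 1 the highest.
    """
    if num_grades < 1:
        raise ValueError("num_grades must be at least 1")
    if not lo <= score <= hi:
        raise ValueError("score is outside the score range")
    width = (hi - lo) / num_grades
    index = int((score - lo) // width)
    # The top endpoint `hi` belongs to the highest bucket.
    return min(index, num_grades - 1)


# Hypothetical labels for a five-grade scheme, lowest to highest.
LABELS = ["F", "D", "C", "B", "A"]


def letter_for(score):
    """Map a score to one of the five illustrative labels."""
    return LABELS[uniform_letter_grade(score, len(LABELS))]
```

Under these assumptions, a score of 73 falls in the fourth of five buckets, [60, 80). Numerical scoring corresponds to the limit in which every score is its own bucket.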

Paper Structure (10 sections, 9 theorems, 35 equations, 5 figures)

Introduction
Model
Theoretical Results
Analyzing $\boldsymbol{\mathcal{D}^{\operatorname{same}}}$.
Analyzing $\boldsymbol{\mathcal{D}^{\operatorname{opp}}}$.
Experiments
Discussion
Intuition Regarding $\mathcal{D}^{\operatorname{same}}$ vs $\mathcal{D}^{\operatorname{opp}}$ & Single-Peakedness
Useful Lemmas
Additional Experimental Results

Key Result

Theorem 1

When the true quality prior $\mathcal{Q}$ and the score model $\mathcal{S}$ are jointly symmetric, and the grading scheme $B$ is symmetric, then we have

Figures (5)

Figure 1: Performance of numerical scoring and different uniform letter grading schemes, with $\mu=65$, $\sigma=12$, $\gamma=1.5$ and $\alpha_d=0.5$ over different motivation coefficients (top) and number of evaluations (bottom). $95\%$ confidence intervals are shown.
Figure 2: Both figures show the probability density function of the score distribution $\mathcal{S}(q)$ when the true quality is $q=73$. The distribution is a truncated normal distribution with mean $q=73$ and standard deviation $\gamma = 1.7$ (top figure) or $\gamma = 6$ (bottom figure). The top figure conveys the intuition behind the conditions in \ref{thm:weak-symm-impr-2} and \ref{clm:threetimesbucketing}, which assume $\Pr[(q,s) \in \mathcal{D}^{\operatorname{same}}]$ to be sufficiently higher than $\Pr[(q,s) \in \mathcal{D}^{\operatorname{opp}}]$. The bottom figure conveys the intuition behind the observation used at the end of the proof of \ref{thm:weak-symm-impr}.
Figure 3: Performance of numerical scoring and different uniform letter grading schemes, with $\sigma=12$, $\gamma=1.5$ and $\alpha_d=0.5$, over different values of $\mu$.
Figure 4: Performance of numerical scoring and different uniform letter grading schemes, with $\mu=65$, $\gamma=1.5$ and $\alpha_d=0.5$, over different values of $\sigma$.
Figure 5: Performance of numerical scoring and different uniform letter grading schemes, with $\mu=65$, $\sigma=12$ and $\alpha_d=0.5$, over different values of $\gamma$.
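The contrast between the two panels of Figure 2 can be made concrete with a short sketch of a truncated normal score distribution. It assumes the truncation range is [0, 100] and that $\gamma$ is the standard deviation of the underlying (untruncated) normal; both are assumptions, and the function names are hypothetical rather than taken from the paper.

```python
import math


def normal_cdf(x, mean, sd):
    """CDF of an untruncated normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def normal_pdf(x, mean, sd):
    """PDF of an untruncated normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def truncated_normal_pdf(x, mean, sd, lo=0.0, hi=100.0):
    """PDF of a normal truncated to [lo, hi] (range is an assumption)."""
    if x < lo or x > hi:
        return 0.0
    total = normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)
    return normal_pdf(x, mean, sd) / total


def truncated_normal_mass(a, b, mean, sd, lo=0.0, hi=100.0):
    """Probability that a truncated-normal score lands in [a, b]."""
    a, b = max(a, lo), min(b, hi)
    if a >= b:
        return 0.0
    total = normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)
    return (normal_cdf(b, mean, sd) - normal_cdf(a, mean, sd)) / total


# True quality q = 73, as in Figure 2; the window [70, 76] is illustrative.
narrow = truncated_normal_mass(70, 76, mean=73, sd=1.7)
wide = truncated_normal_mass(70, 76, mean=73, sd=6.0)
```

With these assumptions, `narrow` is considerably larger than `wide`: a small $\gamma$ concentrates scores within a few points of the true quality, while a larger $\gamma$ spreads them more widely.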

Theorems & Definitions (21)

Definition 1: Jointly Symmetric Distributions
Definition 2: Symmetric Grading Scheme
Theorem 1
proof
Corollary 1
proof
Definition 3: Ex-Ante Single-Peaked Score Model
Theorem 2
proof
Definition 4: Ex-Post Single-Peaked Score Model
...and 11 more
