From NLG Evaluation to Modern Student Assessment in the Era of ChatGPT: The Great Misalignment Problem and Pedagogical Multi-Factor Assessment (P-MFA)

Mika Hämäläinen, Kimmo Leiviskä

TL;DR

This paper identifies a Great Misalignment Problem at the intersection of NLG evaluation and student assessment in AI-enabled education, where evaluation often tracks surface artefacts rather than authentic learning processes. It argues for a process-focused shift and introduces Pedagogical Multi-Factor Assessment (P-MFA), a multi-factor, process-anchored framework inspired by multi-factor authentication. P-MFA combines six evidence streams—knowledge, outputs, application, continuity, reflection, and context—to align learning definitions, methods, and evaluations, increasing transparency and resilience against AI-facilitated artefacts. The work offers a principled path to reframe grading as a dialogic, diagnostic activity that preserves pedagogical integrity in the era of ChatGPT and similar tools.

Abstract

This paper explores the growing epistemic parallel between NLG evaluation and the grading of students at a Finnish university. We argue that both domains are experiencing a Great Misalignment Problem. As students increasingly use tools like ChatGPT to produce sophisticated outputs, traditional assessment methods that focus on final products rather than learning processes have lost their validity. To address this, we introduce the Pedagogical Multi-Factor Assessment (P-MFA) model, a process-based, multi-evidence framework inspired by the logic of multi-factor authentication.

Paper Structure

This paper contains 6 sections.