How Well Can You Articulate that Idea? Insights from Automated Formative Assessment

Mahsa Sheikhi Karizaki; Dana Gnesdilow; Sadhana Puntambekar; Rebecca J. Passonneau

TL;DR

The paper addresses automated formative feedback for open-ended science explanations that require articulating multiple ideas. It applies PyrEval, a pyramid-based semantic-vector tool, with a maximum-independent-set matching framework to map student clauses to six curriculum ideas, validated on GT1, GT2, and MidPhys datasets. The study reveals that feedback accuracy depends on the inherent distinctiveness of ideas and the clarity of student writing, with certain ideas being more formulaically stated and easier to detect. The findings highlight how automated analysis can inform teaching and revision practices by diagnosing articulation difficulties and guiding targeted support for students to express science ideas more clearly.

Abstract

Automated methods are becoming increasingly integrated into studies of formative feedback on students' science explanation writing. Most of this work, however, addresses students' responses to short-answer questions. We investigate automated feedback on students' science explanation essays, where students must articulate multiple ideas. Feedback is based on a rubric that identifies the main ideas students are prompted to include in explanatory essays about the physics of energy and mass, given their experiments with a simulated roller coaster. We have found that students generally improve on revised versions of their essays. Here, however, we focus on two factors that affect the accuracy of the automated feedback. First, we find that the main ideas in the rubric differ in how much freedom they afford in explaining the idea: the explanation of a natural law is relatively constrained, whereas students have more freedom in how they explain complex relations they observe in their roller coasters, such as the transfer of different forms of energy. Second, by tracing the automated decision process, we can diagnose when a student's statement lacks sufficient clarity for the automated tool to associate it more strongly with one of the main ideas above all others. This in turn provides an opportunity for teachers and peers to help students reflect on how to state their ideas more clearly.
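The second factor, a statement that is not associated more strongly with one main idea than with all others, can be pictured as a small margin between the best and second-best cosine similarity. The sketch below is only an illustration under stated assumptions: the three-dimensional vectors, the idea labels, and the function names `cosine` and `clarity_margin` are hypothetical, and this is not PyrEval's actual decision procedure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def clarity_margin(clause_vec, idea_vecs):
    """Return (best_idea, margin), where margin is the gap between the highest
    and second-highest similarity. Requires at least two main ideas.
    A small margin suggests the clause does not single out one idea."""
    sims = sorted(
        ((cosine(clause_vec, vec), name) for name, vec in idea_vecs.items()),
        reverse=True,
    )
    (best_sim, best_name), (second_sim, _) = sims[0], sims[1]
    return best_name, best_sim - second_sim


# Toy vectors, invented for illustration only (not real embeddings).
ideas = {"idea_1": [1.0, 0.0, 0.0], "idea_6": [0.0, 1.0, 0.0]}
clear_clause = [0.1, 0.9, 0.0]      # points mostly toward idea_6
ambiguous_clause = [0.7, 0.7, 0.1]  # equally close to both ideas

clear_idea, clear_margin = clarity_margin(clear_clause, ideas)
_, ambiguous_margin = clarity_margin(ambiguous_clause, ideas)
```

In this toy setting the clear clause has a large margin in favor of one idea, while the ambiguous clause has a margin near zero, which is the kind of case a teacher or peer might flag for revision.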

Paper Structure (9 sections, 6 figures, 4 tables)

Introduction
Related Work
Roller Coaster Physics Curriculum
Data
Selection of a Semantic Vector Method for PyrEval
Feedback Accuracy and Distinctiveness of Ideas
Feedback Accuracy and Student Writing Clarity
Discussion
Conclusion

Figures (6)

Figure 1: Essay 1 Main Ideas with average cosine similarities within the pyramid content unit (Sim), and sample feedback checklist.
Figure 2: A student essay with very mixed writing quality.
Figure 3: Clauses with low versus high clarity, and main ideas they are similar to.
Figure 4: Cosine similarity distributions of clauses in the full assessment hypergraph for an accurate short essay, and a long inaccurate essay.
Figure 5: Cosine similarities of clauses in the validation set (N=117) to main idea 6 (high accuracy) vs. main idea 1 (low accuracy). The test set plot looks the same.
...and 1 more figure
