Using Mathlink Cubes to Introduce Data Wrangling with Examples in R
Lucy D'Agostino McGowan
TL;DR
The paper addresses the challenge of teaching data wrangling to undergraduates with a tangible, manipulative-based approach that precedes coding. It uses mathlink cubes as a physical representation of a data frame and maps common wrangling operations to cube manipulations before translating them into R code with `dplyr`. In a 75-minute classroom activity with students working in groups of three, the author shows how filtering, selecting, mutating, arranging, grouping, and summarizing can be practiced hands-on and then implemented in R, and reports positive student feedback. The work suggests that concrete manipulatives can facilitate collaboration, reduce coding anxiety, and provide a scalable blueprint for undergraduate data science pedagogy; resources and slides are available for replication.
Abstract
This paper explores an innovative approach to teaching data wrangling skills to students through hands-on activities before transitioning to coding. Data wrangling, a critical aspect of data analysis, involves cleaning, transforming, and restructuring data. We introduce the use of a physical tool, mathlink cubes, to facilitate a tangible understanding of data sets. This approach helps students grasp the concepts of data wrangling before implementing them in coding languages such as R. We detail a classroom activity that includes hands-on tasks paralleling common data wrangling processes such as filtering, selecting, and mutating, followed by their coding equivalents using R's `dplyr` package.
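To make the coding side of the activity concrete, the following is a minimal sketch of how the wrangling operations named above might look in `dplyr`. The `cubes` data frame, its column names (`stack`, `color`, `height`), and its values are hypothetical and chosen only for illustration; they are not taken from the paper's materials.

```r
library(dplyr)

# Hypothetical data: one row per stack of cubes.
# Column names and values are illustrative assumptions, not the paper's data.
cubes <- data.frame(
  stack  = 1:6,
  color  = c("red", "blue", "red", "green", "blue", "red"),
  height = c(3, 5, 2, 4, 1, 6)
)

# filter(): keep only the rows that meet a condition
red_stacks <- cubes %>% filter(color == "red")

# select(): keep only the named columns
colors_only <- cubes %>% select(stack, color)

# mutate(): add a new column computed from an existing one
with_double <- cubes %>% mutate(double_height = height * 2)

# arrange(): reorder rows, here from tallest to shortest
sorted <- cubes %>% arrange(desc(height))

# group_by() + summarize(): collapse each group to one summary row
by_color <- cubes %>%
  group_by(color) %>%
  summarize(n = n(), mean_height = mean(height))
```

Each verb takes a data frame and returns a new one, which mirrors the physical activity: every cube manipulation starts from one arrangement of stacks and ends with another.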
