The competent Computational Thinking test (cCTt): a valid, reliable and gender-fair test for longitudinal CT studies in grades 3-6
Laila El-Hamamsy, María Zapata-Cáceres, Estefanía Martín-Barroso, Francesco Mondada, Jessica Dehler Zufferey, Barbara Bruno, Marcos Román-González
TL;DR
This paper addresses the lack of longitudinal, developmentally appropriate CT assessments in primary school by validating the competent CT test (cCTt) across grades 3–6. It combines Classical Test Theory and Item Response Theory with measurement invariance analyses, including differential item functioning, to establish validity, reliability and gender fairness. Key contributions are grade-specific validity and reliability evidence, confirmation of gender fairness, proficiency profiles and Wright maps for tracking cognitive maturation, and normalised scoring that enables cross-grade comparison and bridges the cCTt with the CTt and related instruments. The findings support using the cCTt in multi-year CT studies and give researchers, educators and practitioners practical tools, while pointing to the need for more complex items in higher grades and to opportunities for cross-country validation and transitions between instruments.
Abstract
The introduction of computing education into curricula worldwide requires multi-year assessments to evaluate the long-term impact on learning. However, no single Computational Thinking (CT) assessment spans primary school, and no group of CT assessments provides a means of transitioning between instruments. This study therefore investigated whether the competent CT test (cCTt) could evaluate learning reliably from grades 3 to 6 (ages 7-11) using data from 2709 students. The psychometric analysis employed Classical Test Theory, Item Response Theory, Measurement Invariance analyses (which include Differential Item Functioning), normalised z-scoring, and PISA's methodology to establish proficiency levels. The findings indicate that the cCTt is valid, reliable and gender-fair for grades 3-6, although more complex items would be beneficial for grades 5-6. Grade-specific proficiency levels are provided to help tailor interventions, together with a normalised scoring system to compare students within and across grades and to help establish transitions between instruments. To improve the utility of CT assessments among researchers, educators and practitioners, the findings emphasise the importance of i) developing and validating gender-fair, grade-specific instruments aligned with students' cognitive maturation, ii) providing proficiency levels, and iii) providing equivalency scales to transition between assessments. To conclude, the study provides insight into the design of longitudinal, developmentally appropriate assessments and interventions.
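
To illustrate the idea behind normalised z-scoring, the following is a minimal sketch and not the study's actual procedure. It standardises raw scores within each grade so that a student's position can be read relative to grade peers. The function name `z_scores_by_grade`, the toy scores, and the use of the population standard deviation are all assumptions made for this example.

```python
from statistics import mean, pstdev


def z_scores_by_grade(scores_by_grade):
    """Standardise raw scores within each grade: z = (score - grade mean) / grade SD.

    Assumption: the population standard deviation is used; a real analysis
    might use the sample standard deviation or external norms instead.
    """
    normalised = {}
    for grade, scores in scores_by_grade.items():
        mu = mean(scores)
        sigma = pstdev(scores)
        if sigma == 0:
            # All scores identical: every student sits exactly at the grade mean.
            normalised[grade] = [0.0 for _ in scores]
        else:
            normalised[grade] = [(s - mu) / sigma for s in scores]
    return normalised


# Hypothetical raw scores (not data from the study).
example = {3: [10, 12, 14], 6: [18, 20, 22]}
result = z_scores_by_grade(example)
```

In this toy data the lowest scorer in grade 3 and the lowest scorer in grade 6 receive the same z-score despite different raw scores, which is the sense in which normalised scores allow comparison across grades.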
