Reporting Eye-Tracking Data Quality: Towards a New Standard
Deborah N. Jakobi, Daniel G. Krakowczyk, Lena A. Jäger
TL;DR
This paper addresses the lack of standardized data quality reporting in eye-tracking datasets by proposing that data be published at all pre-processing stages alongside automated data quality reports. It introduces concrete metadata, quality metrics, and an open-source implementation integrated into the pymovements package to enable reproducible, cross-dataset comparisons. The key contributions are trial- and session-level quality reports, calibration/validation metadata, data-loss metrics, and stimulus-dependent quality measures, all intended to make datasets easier to reuse across diverse analyses. The approach aligns with the FAIR principles and aims to shift data-sharing practices toward greater transparency, interoperability, and device-agnostic reporting, with attention to privacy and industry implications.
Abstract
Eye-tracking datasets are often shared in the format their creators used for the original analyses, which usually means that data considered irrelevant to the primary purpose are excluded. To increase the reusability of existing eye-tracking datasets for more diverse use cases that were not initially considered, this work advocates a new approach to sharing eye-tracking data. Instead of publishing only filtered and pre-processed datasets, the eye-tracking data at all pre-processing stages should be published together with data quality reports. To report data quality transparently and enable cross-dataset comparisons, we develop data quality reporting standards and metrics that can be applied automatically to a dataset, and integrate them into the open-source Python package pymovements (https://github.com/aeye-lab/pymovements).
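As an illustration of what an automatically computable quality metric might look like, the following minimal sketch estimates the proportion of lost data in a single trial. It is not the pymovements implementation: the function name `data_loss_ratio`, its parameters, and the convention that invalid samples are encoded as NaN are assumptions made for this example only. The expected number of samples is derived from the recording duration and the nominal sampling rate, so both missing rows and invalid samples count as loss.

```python
import numpy as np


def data_loss_ratio(timestamps_ms, x, y, sampling_rate_hz):
    """Hypothetical helper: proportion of expected gaze samples that are lost.

    Assumptions (not taken from the paper or from pymovements):
    - timestamps are in milliseconds and sorted in ascending order,
    - invalid samples (e.g. blinks, track loss) are encoded as NaN,
    - the tracker records at a constant nominal sampling rate.
    """
    timestamps_ms = np.asarray(timestamps_ms, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if timestamps_ms.size == 0:
        raise ValueError("at least one sample is required")

    # Number of samples the tracker should have delivered in this time span.
    duration_ms = timestamps_ms[-1] - timestamps_ms[0]
    expected = int(round(duration_ms * sampling_rate_hz / 1000.0)) + 1

    # A sample is valid only if both gaze coordinates are finite.
    valid = int(np.sum(np.isfinite(x) & np.isfinite(y)))

    return 1.0 - valid / expected


# Usage: ten samples at 1000 Hz, two of which are invalid.
t = np.arange(10)
gaze_x = np.array([1.0, 2.0, np.nan, 4.0, 5.0, 6.0, np.nan, 8.0, 9.0, 10.0])
gaze_y = np.ones(10)
loss = data_loss_ratio(t, gaze_x, gaze_y, sampling_rate_hz=1000)
print(f"data loss: {loss:.1%}")
```

Reporting such a value per trial and per session, rather than silently dropping affected trials, lets downstream users apply exclusion thresholds suited to their own analyses.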
