On the Influence of Reading Sequences on Knowledge Gain during Web Search

Wolfgang Gritz, Anett Hoppe, Ralph Ewerth

TL;DR

The authors examine how reading sequences during web search relate to knowledge gain by extending a line-based reading model so that it detects reading across multiple lines of rendered web pages. They combine eye-tracking fixations with the positions of the page text and apply a Viterbi-based word assignment to identify reading sequences that span lines and paragraphs, evaluated on the SaL-Lightning dataset. Their findings show that higher knowledge gain is associated with longer reading times, more words read, and faster reading accompanied by more backward regressions. Higher pre-existing knowledge corresponds to deeper reading but not necessarily to higher knowledge gain, which suggests that skim-and-regress strategies may support learning. The work provides a public codebase and highlights reading-behavior features on textual content pages as predictors of learning outcomes; its limitations concern language direction and content modality.

Abstract

Nowadays, learning increasingly involves the use of search engines and web resources. The related interdisciplinary research field "search as learning" aims to understand how people learn on the web. Previous work has investigated several feature classes to predict, for instance, the expected knowledge gain during web search. Among these, eye-tracking features have not been studied extensively so far. In this paper, we extend a previously used reading model from a line-based one to one that can detect reading sequences across multiple lines. We use publicly available study data from a web-based learning task to examine the relationship between our feature set and the participants' test scores. Our findings demonstrate that learners with higher knowledge gain spent significantly more time reading and processed more words in total. We also find evidence that faster reading at the expense of more backward regressions may be an indicator of better web-based learning. We make our code publicly available at https://github.com/TIBHannover/reading_web_search.

Paper Structure (15 sections, 1 equation, 1 table)