Measuring individual semantic networks: A simulation study

Samuel Aeschbach, Rui Mata, Dirk U. Wulff

Abstract

Accurately capturing individual differences in semantic networks is fundamental to advancing our mechanistic understanding of semantic memory. Past empirical attempts to construct individual-level semantic networks from behavioral paradigms may be limited by data constraints. To assess these limitations and propose improved designs for the measurement of individual semantic networks, we conducted a recovery simulation investigating the psychometric properties underlying estimates of individual semantic networks obtained from two different behavioral paradigms: free associations and relatedness judgment tasks. Our results show that successful inference of semantic networks is achievable, but they also highlight critical challenges. Estimates of absolute network characteristics are severely biased, such that comparisons between behavioral paradigms and different design configurations are often not meaningful. However, comparisons within a given paradigm and design configuration can be accurate and generalizable when based on designs with moderate numbers of cues, moderate numbers of responses, and cue sets including diverse words. Ultimately, our results provide insights that help evaluate past findings on the structure of semantic networks and design new studies capable of more reliably revealing individual differences in semantic networks.

Paper Structure

This paper contains 21 sections, 2 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Overview of the recovery simulation setup. The simulation setup consists of four elements. (1) Generating individually different ground-truth semantic networks from a vector-space model (i.e., fastText mikolov_advances_2017). (2) Simulating behavioral data, varying the study design parameters cue set type, cue set size, number of responses, and response type. (3) Inferring networks from the simulated behavioral data with network inference methods matching the response type. (4) Evaluating the network recovery by comparing the inferred networks' measures to those of the ground-truth networks with respect to bias, resolution, and generalizability.
  • Figure 2: Bias. Bias is defined as the ratio between inferred and ground-truth measures. Negative values represent underestimations, whereas positive values represent overestimations. We define an acceptable recovery as bias within $\pm 30\%$ of the ground truth (light gray tiles; $-0.3 \leq \text{bias} \leq 0.3$), compared to underestimations of more than 30% (red tiles; $\text{bias} < -0.3$) and overestimations of more than 30% (blue tiles; $\text{bias} > 0.3$).
  • Figure 3: Resolution. Resolution is reported as the mean Spearman correlation ($r$) between measures of the ground-truth networks and those of the inferred networks. Resolution of the recovery is defined as good for values $r \geq .5$ (yellow tiles), positive for values $0 \leq r < .5$ (teal tiles), and negative for values $r < 0$ (purple tiles).
  • Figure 4: Generalizability of bias and resolution. Larger font numbers indicate global comparison values; smaller font numbers indicate local comparison values.
  • Figure S1: Free association parameter tuning. (A) Pearson correlations and (B) Spearman rank correlations between the probability distributions of first responses in the model's free association responses and in the SWOW free association norms de_deyne_small_2019. Both correlation measures suggest $\gamma_w = 10$ and $\gamma_f = 1$ as best-fitting parameter values with $r_{Pearson} = 0.453$ and $r_{Spearman} = 0.353$. (C) Pearson correlations of first responses, analogous to Panels A and B but with fixed model parameters $\gamma_w = 10$ and $\gamma_f = 1$ and a varying minimal edge weight for the ground-truth semantic network. This analysis shows that an edge weight cutoff of 0.2, as implemented in the study, performs well compared with other cutoffs. Median rank of the free association model's (D) first, (E) second, and (F) third response words among the SWOW norms de_deyne_small_2019 for the same cue word. Panels D, E, and F show that the best-fitting model parameters $\gamma_w = 10$ and $\gamma_f = 1$ generate monotonically increasing median ranks ($Med(R_1) = 2$, $Med(R_2) = 4$, $Med(R_3) = 5$).
  • ...and 3 more figures
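
The bias and resolution criteria in the captions of Figures 2 and 3 can be illustrated with a minimal Python sketch. It rests on assumptions that the captions do not confirm. Because negative bias values denote underestimation, bias is computed here as the signed relative deviation (inferred - truth) / truth, which may differ from the paper's exact formula. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def relative_bias(inferred, truth):
    """Signed relative deviation from the ground truth (assumed bias definition)."""
    return (inferred - truth) / truth


def classify_bias(bias, tolerance=0.3):
    """Apply the +/-30% acceptance band described for Figure 2."""
    if bias < -tolerance:
        return "underestimation"
    if bias > tolerance:
        return "overestimation"
    return "acceptable"


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def classify_resolution(r):
    """Apply the resolution categories described for Figure 3."""
    if r >= 0.5:
        return "good"
    if r >= 0:
        return "positive"
    return "negative"


# Hypothetical network measure for five simulated individuals (made-up numbers).
truth = [0.40, 0.55, 0.30, 0.62, 0.48]
inferred = [0.22, 0.35, 0.20, 0.33, 0.30]

mean_bias = mean(relative_bias(i, t) for i, t in zip(inferred, truth))
resolution = spearman(truth, inferred)

print(classify_bias(mean_bias), classify_resolution(resolution))
```

In this made-up example the inferred values are all well below the ground truth, yet they preserve most of the rank order across individuals. This is the pattern the abstract describes: absolute estimates can be strongly biased while relative comparisons within one design remain informative.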