Lost in Translation: How Does Bilingualism Shape Reader Preferences for Annotated Charts?

Anjana Arunkumar, Lace Padilla, Chris Bryan

TL;DR

This study investigates how annotation density and semantic content shape bilingual readers’ preferences for, and comprehension of, annotated charts across English–Tamil and English–Arabic groups. Using a large-scale five-phase design with six chart types and two stimulus sets ($A$ and $B$), annotations were translated and validated, and relationships were analyzed via structural equation modeling with the WLSMV estimator. Key findings show that English annotations are preferred for dense, information-rich visuals, while native-language full-text annotations maximize comprehension; semantic-depth effects vary by density and language, and linguistic immersion moderates preferences but does not uniformly decide them. The work yields practical guidelines for inclusive multilingual visualization design, including language-switching capabilities, mixed-language annotation strategies, and density-adaptive approaches tailored to chart type and user language experience.

Abstract

Visualizations are powerful tools for conveying information but often rely on accompanying text for essential context and guidance. This study investigates the impact of annotation patterns on reader preferences and comprehension accuracy among multilingual populations, addressing a gap in visualization research. We conducted experiments with two groups fluent in English and either Tamil (n = 557) or Arabic (n = 539) across six visualization types, each varying in annotation volume and semantic content. Full-text annotations yielded the highest comprehension accuracy across all languages, while preferences diverged: English readers favored highly annotated charts, whereas Tamil/Arabic readers preferred full-text or minimally annotated versions. Semantic variations in annotations (L1-L4) did not significantly affect comprehension, demonstrating the robustness of text comprehension across languages. English annotations were generally preferred, with a tendency to think technically in English linked to greater aversion to non-English annotations, though this diminished among participants who regularly switched languages internally. Non-English annotations incorporating visual or external knowledge were less favored, particularly in titles. Our findings highlight cultural and educational factors influencing perceptions of visual information, underscoring the need for inclusive annotation practices for diverse linguistic audiences. All data and materials are available at: https://osf.io/ckdb4/.

Paper Structure

This paper contains 27 sections, 12 figures, 6 tables.

Figures (12)

  • Figure 1: 18 charts generated for study phases 1–4 (unannotated), spanning 6 chart types × 3 data shapes. In our main study, participants are shown one data shape per chart type (i.e., one chart per column of this figure) to complete study tasks. This was done to reduce the potential biasing effect of data shape.
  • Figure 2: Example bar chart stimuli for Set A, annotated in English, a total of 4 variants. We aimed to capture preferences (RQ1) and comprehension (RQ2) for the extremes between visual and textual presentation of information. (a) Chart presented with no text (beyond axes and ticks). (b) Chart with a title and a single annotation (Title + 1A). (c) Chart which displays a narrative or story around the data, annotated through text (Title: L1 + 3 annotations: L2–L4; Title + 3A). (d) A text-only version of the data, with the same story as displayed in (c).
  • Figure 3: Example bar chart stimuli for Set B (fine-grained comparisons), annotated in English, a total of 11 variants. We aimed to capture preferences (RQ1) and comprehension (RQ2) for different levels of semantic content in chart annotations. We accordingly construct variants such that for different annotation volumes, different combinations of L2, L3, and L4 annotations may be present. (a), (b) Title-Only charts: the main title represents L1 and subtitles are included to incorporate L2–L4 information (total: 4 variants). (c) Title+1A charts: title + a single embedded annotation from L2–L4 (total: 3 variants). (d) Title+2A charts: title + two embedded annotations from L2–L4 (total: 3 variants). (e) Title+3A chart: title + three embedded annotations from L2–L4 (total: 1 variant).
  • Figure 4: Example of the stimulus creation process, based on the identification, ranking, and synthesis of chart/text emphasis features. The initial chart from the article is shown in (a), with the most prominent visual emphasis features highlighted in yellow. (b) represents the corresponding article text, which has textual emphasis features highlighted in teal. In (c), the blank chart created using d3 is shown, with potential annotations for highlighted prominent regions. Red indicates the most prominent region, green the second, and blue the third. The types of annotation positions are outlined in (d). In step (e), an expert designer adjusted the fine details to produce a chart with a realistic layout, corresponding to (f) a summary text paragraph synthesized by the annotators.
  • Figure 5: Example of a bar chart translated into Tamil and Arabic. The table displays the back-translations of Tamil and Arabic into English, for all chart annotations. Note that the x-axis tick marks are transliterated phonetically as they represent the names of airline companies.
  • ...and 7 more figures
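
The Set B variant counts in the Figure 3 caption (4 + 3 + 3 + 1 = 11) can be reproduced by enumerating subsets of the L2, L3, and L4 annotation levels. The following is a minimal sketch, not the authors' stimulus-generation code. The function name `set_b_variants` is hypothetical, and the breakdown of the four Title-Only variants (the L1 title alone, or with a single L2, L3, or L4 subtitle) is an assumption; the caption states only the total.

```python
from itertools import combinations

# Semantic levels available for embedded annotations (the title is always L1).
LEVELS = ("L2", "L3", "L4")


def set_b_variants():
    """Enumerate Set B chart variants as (volume_label, levels) pairs."""
    variants = []

    # Title-Only charts. ASSUMPTION: one variant with the L1 title alone,
    # plus one variant per subtitle level (L2, L3, L4). The caption only
    # gives the total of 4.
    variants.append(("Title-Only", ()))
    for level in LEVELS:
        variants.append(("Title-Only", (level,)))

    # Title+kA charts: title plus k embedded annotations drawn from L2-L4,
    # i.e. every k-element subset of LEVELS.
    for k in (1, 2, 3):
        for combo in combinations(LEVELS, k):
            variants.append((f"Title+{k}A", combo))

    return variants


if __name__ == "__main__":
    for label, levels in set_b_variants():
        print(label, "+".join(levels) or "(no subtitle)")
```

Under these assumptions, the Title+1A, Title+2A, and Title+3A groups contain C(3,1) = 3, C(3,2) = 3, and C(3,3) = 1 variants respectively, matching the totals given in the caption.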