CommentScope: A Comment-Embedded Assisted Reading System for a Long Text

Shuai Chen, Lei Han, Haoyu Wang, Zhaoman Zhong

TL;DR

CommentScope tackles information overload in long-form reading by embedding social comments directly into the text. It introduces a two-part Rule+LLM pipeline to semantically classify comments and anchor them to exact text locations, paired with a multi-level, visually rich interface that shows inline, paragraph, and global comment contexts. Technical evaluation shows high semantic and location accuracy with efficiency gains, and a user study demonstrates improved reading speed, comprehension, and reduced cognitive load compared to traditional end-of-article comment listings. The work offers a practical, scalable approach to augment social reading, with clear paths for accessibility, multimodal content, and cross-platform deployment.

Abstract

Long texts are ubiquitous on social platforms, yet readers often face information overload and struggle to locate key content. Comments provide valuable external perspectives for understanding, questioning, and complementing the text, but their potential is hindered by disorganized and unstructured presentation. Few studies have explored embedding comments directly into reading. As an exploratory step, we propose CommentScope, a system with two core modules: a pipeline that classifies comments into five types and aligns them with relevant sentences, and a presentation module that integrates comments inline or as side notes, supported by visual cues such as colors, charts, and highlights. Technical evaluation shows that the hybrid "Rule+LLM" pipeline achieved solid performance in semantic classification (accuracy=0.90) and position alignment (accuracy=0.88). A user study (N=12) further demonstrated that the sentence-end embedding significantly improved comment discovery accuracy and reading fluency while reducing mental demand and perceived effort.

Paper Structure

This paper contains 53 sections, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Four comment layouts and their design inspirations. The left side shows the sources of inspiration, and the right side illustrates the comment layouts: (a) Text-End Embedding, (b) Sentence-End Embedding, (c) Between-Line Embedding, and (d) Click-to-Show.
  • Figure 2: Overview of the system interface. A: Top control panel with display toggles, quantitative filters, and classification legends. B: Left-side global comment panel showing comments related to the overall article. C: Main article content area with sentence-end embedding, serving as the core reading and interaction space. D: Right-side sentence-level comment panel, appearing dynamically when a user selects a sentence, displaying detailed comments linked to that sentence.
  • Figure 3: Paragraph-level comment view, showing the end-of-paragraph button, comment count markers, and collapsed/expanded states.
  • Figure 4: Pipeline of semantic and positional classification in CommentScope. The pipeline consists of three core steps: (1) Input Dataset $\rightarrow$ Rule Filtering: Comments and articles are used as input. Rules for semantic classification (symbol matching, keyword matching, semantic matching) and location classification (position indicator word matching, original citation matching, entity matching) provide initial filtering. (2) Rule Filtering $\rightarrow$ LLM Semantic Judgment: Samples not covered by rules are classified by a large language model (LLM), while rule-identified samples are further verified. (3) LLM Semantic Judgment $\rightarrow$ Output: The system produces semantic categories (statement, question, exclamation, suggestion, sarcasm) and location categories (sentence-level, paragraph-level, global-level).
  • Figure 5: Four comment embedding interface designs: Baseline (BL), Click-to-Show (CS), Sentence-End Embedding (SE), and Between-Line Embedding (BE).
  • ...and 4 more figures
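
The rule-first, LLM-fallback flow described in the Figure 4 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the keyword list, and the regular expressions are all hypothetical, the LLM is represented by caller-supplied callables, and the step in which the LLM re-verifies rule-identified samples is omitted for brevity. Only the five semantic labels and three location levels come from the caption.

```python
import re
from typing import Callable, List, Optional, Tuple

# Label sets taken from the Figure 4 caption.
SEMANTIC_LABELS = ("statement", "question", "exclamation", "suggestion", "sarcasm")
LOCATION_LEVELS = ("sentence-level", "paragraph-level", "global-level")

# Hypothetical keyword list; the paper's actual rules are not given here.
SUGGESTION_KEYWORDS = ("should", "suggest", "recommend", "how about")

Location = Tuple[str, Optional[int]]  # (level, index of sentence/paragraph or None)


def rule_semantic(comment: str) -> Optional[str]:
    """Symbol and keyword matching; returns None when no rule fires."""
    text = comment.strip().lower()
    if text.endswith(("?", "？")):
        return "question"
    if text.endswith(("!", "！")):
        return "exclamation"
    if any(keyword in text for keyword in SUGGESTION_KEYWORDS):
        return "suggestion"
    return None


def rule_location(comment: str, sentences: List[str]) -> Optional[Location]:
    """Original-citation and position-indicator matching; None when no rule fires."""
    # Original citation matching: a quoted span that appears verbatim in a sentence.
    for quoted in re.findall(r'"([^"]+)"', comment):
        for index, sentence in enumerate(sentences):
            if quoted in sentence:
                return ("sentence-level", index)
    # Position indicator matching: e.g. "paragraph 2" (converted to a 0-based index).
    match = re.search(r"paragraph (\d+)", comment.lower())
    if match:
        return ("paragraph-level", int(match.group(1)) - 1)
    return None


def classify(
    comment: str,
    sentences: List[str],
    llm_semantic: Callable[[str], str],
    llm_location: Callable[[str, List[str]], Location],
) -> Tuple[str, Location]:
    """Apply the rules first; fall back to the LLM callables for uncovered samples."""
    semantic = rule_semantic(comment) or llm_semantic(comment)
    location = rule_location(comment, sentences) or llm_location(comment, sentences)
    return semantic, location
```

In this sketch, `llm_semantic` and `llm_location` stand in for whatever model calls a real system would make, so the rule layer can be exercised with simple stubs before any model is wired in.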