Beyond the Lens: Quantifying the Impact of Scientific Documentaries through Amazon Reviews

Jill Naiman, Aria Pessianzadeh, Hanyu Zhao, AJ Christensen, Kalina Borkiewicz, Shriya Srikanth, Anushka Gami, Emma Maxwell, Louisa Zhang, Sri Nithya Yeragorla, Rezvaneh Rezapour

TL;DR

The paper addresses the difficulty of quantifying the public impact of scientific documentaries by analyzing Amazon reviews at the sentence level, using a new taxonomy of impact categories alongside sentiment labels. It contributes a dataset of 1296 human-annotated sentences from 1043 Amazon reviews of six documentaries created in whole or in part by the Advanced Visualization Lab (AVL), and it trains and evaluates several machine learning and large language models to classify sentiment and impact, validating the approach with a Hubble evaluation set. Transformer and large language models, especially when given full-context prompts or fine-tuned on task data, achieve strong performance and show possible generalizability to documentaries beyond those studied. The work demonstrates that quantitative, scalable analysis of audience responses can reveal how documentaries engage cognition, attitudes, and interests, which can inform science communication practice and future media design. It also provides a foundation for cross-platform generalization and longitudinal assessment of documentary impact.

Abstract

Engaging the public with science is critical for a well-informed population. A popular method of scientific communication is documentaries. Once released, it can be difficult to assess the impact of such works on a large scale, due to the overhead required for in-depth audience feedback studies. In what follows, we overview our complementary approach to qualitative studies through quantitative impact and sentiment analysis of Amazon reviews for several scientific documentaries. In addition to developing a novel impact category taxonomy for this analysis, we release a dataset containing 1296 human-annotated sentences from 1043 Amazon reviews for six movies created in whole or part by the Advanced Visualization Lab (AVL). This interdisciplinary team is housed at the National Center for Supercomputing Applications and consists of visualization designers who focus on cinematic presentations of scientific data. Using this data, we train and evaluate several machine learning and large language models, discussing their effectiveness and possible generalizability for documentaries beyond those focused on for this work. Themes are also extracted from our annotated dataset which, along with our large language model analysis, demonstrate a measure of the ability of scientific documentaries to engage with the public.

Paper Structure

This paper contains 25 sections, 3 figures, 7 tables.

Figures (3)

  • Figure 1: Breakdown of all 1296 sentences in the annotated dataset, as described in the "Data" section. Numbers are the percentage of all sentences in the dataset that fall within each Sentiment and Impact combination.
  • Figure 2: First stage of annotation process in which users are instructed to select their first choice for the sentiment category for each sentence. Each sentence (top) is shown in the context of the full review (bottom). The annotator moves to the next stage of annotation with the green "Done" button (or blue "Done & Talk" button to post comments in the project forum). More information (comment URL and film title) is accessed with the circled "i" Info button and annotation can be saved in the user's personal collection for later reference with the heart Favorite button.
  • Figure 3: Prompt used for prediction of impact categories.
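
The percentages described in the Figure 1 caption (the share of all annotated sentences that falls within each Sentiment and Impact combination) can be illustrated with a minimal sketch. This is not the authors' code: the function name `combination_percentages` and the label strings below are hypothetical placeholders chosen for illustration, and the example data is invented rather than taken from the released dataset.

```python
from collections import Counter


def combination_percentages(labels):
    """Return {(sentiment, impact): percent of all sentences} for a list of label pairs."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    # Each percentage is relative to the full set of sentences, not to one sentiment.
    return {combo: 100.0 * n / total for combo, n in counts.items()}


# Hypothetical (sentiment, impact) pairs, one per annotated sentence.
# These label names are placeholders, not the taxonomy from the paper.
example = [
    ("positive", "cognition"),
    ("positive", "cognition"),
    ("positive", "interest"),
    ("negative", "attitude"),
]

percentages = combination_percentages(example)
for combo, pct in sorted(percentages.items()):
    print(combo, f"{pct:.1f}%")
```

Because every sentence carries exactly one pair of labels in this sketch, the percentages across all combinations sum to 100.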