Social inequality and cultural factors impact the awareness and reaction during the cryptic transmission period of pandemic

Zhuoren Jiang, Xiaozhong Liu, Yangyang Kang, Changlong Sun, Yong-Yeol Ahn, Johan Bollen

TL;DR

This study investigates how awareness of an emerging pandemic diffused through Chinese society during the cryptic transmission period using a massive Alibaba e-commerce dataset (46.5B queries and 150B records across 88 days for 94M individuals). It shows that diffusion was shaped by geography, education, income, and social ties, with earlier signals concentrated near the epicenter and stronger diffusion through family and school networks; cultural factors such as social tightness also modulated diffusion, revealing persistent inequities. The authors develop time-evolving logistic regression models to predict awareness and identify phase-specific profiles of typical aware individuals. The findings contribute to diffusion theory and public health strategy by highlighting how to leverage social networks and e-commerce signals to address information gaps and target interventions for disadvantaged groups during future pandemics.

Abstract

The World Health Organization (WHO) declared the COVID-19 outbreak a Public Health Emergency of International Concern (PHEIC) on January 31, 2020. However, rumors of a "mysterious virus" had already been circulating in China in December 2019, possibly preceding the first confirmed COVID-19 case. Understanding how awareness about an emerging pandemic spreads through society is vital not only for enhancing disease surveillance, but also for mitigating demand shocks and social inequities, such as shortages of personal protective equipment (PPE) and essential supplies. Here we leverage a massive e-commerce dataset comprising 150 billion online queries and purchase records from 94 million people to detect the traces of early awareness and public response during the cryptic transmission period of COVID-19. Our analysis focuses on identifying information gaps across different demographic cohorts, revealing significant social inequities and the role of cultural factors in shaping awareness diffusion and response behaviors. By modeling awareness diffusion in heterogeneous social networks and analyzing online shopping behavior, we uncover the evolving characteristics of vulnerable populations. Our findings expand the theoretical understanding of awareness spread and social inequality in the early stages of a pandemic, highlighting the critical importance of e-commerce data and social network data in addressing future pandemic challenges effectively and in a timely manner. We also provide actionable recommendations to better manage and mitigate dynamic social inequalities in public health crises.

Paper Structure

This paper contains 22 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: The Patterns of Awareness Diffusion (5 Phases). (a) The diffusion of awareness and reaction. The red Y-axis on the left represents the daily growth in the aware population (red dotted line), and the blue Y-axis on the right corresponds to the cumulative aware population (blue dash-dotted line). The daily trend has two peaks: 9,289,545 newly aware on 01/23/2020 (Wuhan lockdown) and 11,655,320 newly aware on 01/25/2020 (30 provincial-level regions activated a first-level public health emergency response). (b) The geographic distribution of awareness (366 cities) on four representative days from different phases. The size of each circle indicates the aware population, the color indicates the awareness percentage, the triangle represents the epicenter, and the squares mark the city with the most aware individuals. Awareness first surged from the epicenter, Wuhan, in the beginning phase and gradually spread across the whole country as the pandemic grew more severe. (c) The awareness percentage trends of different education groups on four representative days from different phases. In the beginning phase, the group with graduate degrees had the highest awareness percentage after the Wuhan Municipal Health Commission (MHC) released a pneumonia outbreak briefing (12/31/2019). A similar trend can be observed in the growth phase, when China's National Health Commission (NHC) confirmed human-to-human transmission (01/20/2020). During the peak phase (after the Wuhan lockdown), the awareness percentages of the graduate and bachelor groups were close to each other and higher than that of the college-or-lower group. This trend continued during the post-peak phase. (d) Neighborhood awareness ratio (the ratio of aware individuals' aware-neighbor percentage to unaware individuals' aware-neighbor percentage) for three types of social relations. In the beginning phase, all three social relations diffused pandemic information efficiently (ratios greater than 8), with family relations showing the highest diffusion efficiency (ratios $\in [63.9, 128.7]$). After the growth phase, the neighbors of aware and unaware individuals become hard to tell apart (the ratio converges to 1). The basemap used in Figure 1 is sourced from the National Platform for Common Geospatial Information Service, China (Map Approval Number: GS (2024) 0650) and complies with Chinese laws and regulations. This map is for illustrative purposes only, used to visualize research data without any political or territorial assertions.
  • Figure 2: The Awareness Trends across Different Demographic Groups with Important News Events. Each line represents a cross-group awareness ratio $R = P_{G_1}/P_{G_2}$, where $P_{G_i}$ is the percentage of aware people in group $G_i$. When the first official pandemic briefing was released (12/31/2019), the female, with-children, and unmarried groups reacted more quickly (the cross-group awareness ratio trend-lines dropped). After strict screening tests were activated in Wuhan (01/16/2020), the female, with-children, and unmarried groups showed stronger awareness (the awareness ratio trend-lines dropped and kept declining). After the NHC confirmed human-to-human transmission (01/20/2020), the male and without-children groups began to show notably stronger awareness (the awareness ratio trend-lines began to rise). It was not until the Wuhan lockdown (01/23/2020) that the married group began to show relatively stronger awareness than the unmarried group (the married-aware/unmarried-aware ratio trend-line began to rise). A base-10 log scale is applied to the Y-axis.
  • Figure 3: Awareness (percentage) Patterns for Different Occupation Groups (upper sub-plot). A base-10 log scale is applied to the Y-axis; the left cut-out zooms in on the 5 days around the Wuhan lockdown (01/23/2020); the right cut-out zooms in on the 7 days around the WHO declaration on the new coronavirus outbreak (01/31/2020) in the post-peak phase. Hospital staff maintained the highest awareness percentage (0.16%-0.45%) throughout the beginning phase. In the growth phase, the education/research group surpassed hospital staff and became the most aware group (0.39%-25%), while the agriculture, forestry, animal husbandry, and fishery group was the least aware (0.16%-10.09%). The peak phase showed a similar pattern: education/research was the most aware group (37.24%-66.25%), while agriculture, forestry, animal husbandry, and fishery was the least aware (16.86%-44.03%). The gaps between occupation groups narrowed during the post-peak phase. The lower sub-plot shows four representative days from different phases; its Y-axis is the average purchasing power of the aware population in each occupation group. The results show that high-income people responded to the emerging pandemic more quickly than low-income people.
  • Figure 4: The Trends of Geography-related Spearman's Rank Correlation Coefficients: Distance to Epicenter (Wuhan) vs. Awareness Percentage (for 366 major cities); Confirmed COVID-19 Cases, GDP, Cultural Tightness, Paddy Rice Percentage, Technology Innovation Index, Illiterate Population Proportion, and Multi-ethnic Household Percentage, each vs. Awareness Percentage (31 provinces). After the beginning phase, the distance from the epicenter, the proportion of the illiterate population, and the percentage of multi-ethnic households are negatively correlated with awareness. In contrast, the number of confirmed COVID-19 cases, GDP, the percentage of paddy rice, and the technology innovation index are positively correlated with awareness.
  • Figure 5: Logistic Regression Models (left), at different awareness percentage points, show the time-evolving trends of the odds ratios of demographic and social relation features. The X-axis is the awareness percentage and the Y-axis is the odds ratio of each variable. The typical characteristics of an aware individual across the different phases are shown on the right.
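
The two ratios defined in the captions of Figures 1(d) and 2 can be illustrated with a minimal Python sketch on toy data. This is not the authors' code: the person IDs, group memberships, and social graph below are hypothetical, and averaging each individual's aware-neighbor share within the aware and unaware sets is an assumption, since the captions do not state how the percentages are aggregated.

```python
from typing import Dict, Iterable, Set


def aware_percentage(group: Set[str], aware: Set[str]) -> float:
    """Percentage of a group's members that are aware (P_G in Figure 2)."""
    if not group:
        raise ValueError("group must not be empty")
    return 100.0 * len(group & aware) / len(group)


def cross_group_ratio(g1: Set[str], g2: Set[str], aware: Set[str]) -> float:
    """Cross-group awareness ratio R = P_G1 / P_G2."""
    p2 = aware_percentage(g2, aware)
    if p2 == 0:
        raise ValueError("reference group has no aware members")
    return aware_percentage(g1, aware) / p2


def mean_aware_neighbor_share(
    people: Iterable[str], neighbors: Dict[str, Set[str]], aware: Set[str]
) -> float:
    """Average fraction of aware neighbors (assumed aggregation: simple mean)."""
    shares = [
        len(neighbors[p] & aware) / len(neighbors[p])
        for p in people
        if neighbors[p]  # skip isolated individuals
    ]
    if not shares:
        raise ValueError("no individuals with neighbors")
    return sum(shares) / len(shares)


def neighborhood_awareness_ratio(
    neighbors: Dict[str, Set[str]], aware: Set[str]
) -> float:
    """Aware individuals' aware-neighbor share over unaware individuals' share."""
    aware_people = [p for p in neighbors if p in aware]
    unaware_people = [p for p in neighbors if p not in aware]
    denominator = mean_aware_neighbor_share(unaware_people, neighbors, aware)
    if denominator == 0:
        raise ValueError("unaware individuals have no aware neighbors")
    return mean_aware_neighbor_share(aware_people, neighbors, aware) / denominator


# Hypothetical toy data: eight people, two demographic groups, one relation type.
females = {"a", "b", "c", "d"}
males = {"e", "f", "g", "h"}
aware = {"a", "b", "e"}
neighbors = {
    "a": {"b", "c"}, "b": {"a", "e"}, "c": {"a", "d"}, "d": {"c", "h"},
    "e": {"b", "f"}, "f": {"e", "g"}, "g": {"f", "h"}, "h": {"d", "g"},
}

r_male_female = cross_group_ratio(males, females, aware)
n_ratio = neighborhood_awareness_ratio(neighbors, aware)
print(f"male/female awareness ratio R = {r_male_female:.2f}")
print(f"neighborhood awareness ratio  = {n_ratio:.2f}")
```

On this toy graph the male/female ratio is 0.50 (25% vs. 50% aware) and the neighborhood awareness ratio is about 3.33. Under the reading in the captions, a neighborhood ratio well above 1 means awareness clusters along that social relation, and a ratio near 1 means the neighbors of aware and unaware individuals are no longer distinguishable.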