Utilizing Dynamic Time Warping for Pandemic Surveillance: Understanding the Relationship between Google Trends Network Metrics and COVID-19 Incidences
Michael T. Lopez, Cheska Elise Hung, Maria Regina Justina E. Estuar
TL;DR
The paper asks whether network metrics derived from Google Trends (GT) can serve as early signals of COVID-19 spread in Metro Manila. It applies Dynamic Time Warping (DTW) to align network statistics, computed from the relative search volume (RSV) of 15 keywords, with COVID-19 case trajectories. Time-varying networks are constructed using two preprocessing methods and four correlation thresholds, and 320 configurations spanning correlation window sizes and Sakoe-Chiba radii are evaluated to quantify alignment with confirmed and active cases. Five of the six parameters significantly influence alignment, with disease-case type and radius playing dominant roles. The best overall configuration uses network density with the Rescaling Daily Data method, a threshold of $0.8$, a 15-day window, and a radius of $50$ days, achieving a DTW score of $36.30$ against confirmed cases. The results support the potential of GT-based network metrics as complementary epidemic surveillance tools in the Philippines, offering early signals in settings with high online activity and informing strategic communication and public health decisions. The study also highlights the importance of parameter choices, particularly the Sakoe-Chiba radius and the data preprocessing method, for operational deployment of surveillance systems driven by search behavior.
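The network construction summarized above can be illustrated with a minimal sketch. It assumes that each keyword is a node, that an edge is drawn when the Pearson correlation between two keywords' RSV series within a sliding window meets the threshold, and that the window advances one day at a time. The function name `rolling_network_density`, the use of Pearson correlation, and the handling of constant series are illustrative assumptions, not details confirmed by the paper.

```python
import numpy as np

def rolling_network_density(rsv, window=15, threshold=0.8):
    """Density of a thresholded correlation network for each sliding window.

    rsv: array of shape (n_days, n_keywords), one RSV series per column.
    Assumption: an edge exists when Pearson r >= threshold (signed, not absolute).
    """
    rsv = np.asarray(rsv, dtype=float)
    n_days, n_keywords = rsv.shape
    max_edges = n_keywords * (n_keywords - 1) / 2
    upper = np.triu_indices(n_keywords, k=1)  # each keyword pair counted once

    densities = []
    for start in range(n_days - window + 1):
        block = rsv[start:start + window]
        # Constant series yield NaN correlations; treat them as "no edge".
        with np.errstate(invalid="ignore", divide="ignore"):
            corr = np.corrcoef(block, rowvar=False)
        corr = np.nan_to_num(corr, nan=0.0)
        n_edges = np.sum(corr[upper] >= threshold)
        densities.append(n_edges / max_edges)
    return np.array(densities)
```

Each call returns one density value per window position, giving a time series that can then be compared with a case trajectory.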
Abstract
The use of network statistics derived from Google Trends data to anticipate COVID-19 disease progression is gaining momentum in infodemiology. This study applied the approach to Metro Manila, National Capital Region, Philippines. Dynamic time warping (DTW) was used to quantify the temporal alignment between network metrics and COVID-19 case trajectories. A total of 320 parameter configurations were systematically explored, covering two network metrics (network density and clustering coefficient), two data preprocessing methods (Rescaling Daily Data and MSV), multiple thresholds, two correlation window sizes, and Sakoe-Chiba band constraints. Kruskal-Wallis tests revealed that five of the six parameters significantly influenced alignment quality, with the disease comparison type (active cases vs. confirmed cases) showing the strongest effect. The optimal configuration, which combined the network density statistic with the Rescaling Daily Data transformation, a threshold of 0.8, a 15-day window, and a 50-day radius constraint, achieved a DTW score of 36.30, indicating substantial temporal alignment with the COVID-19 confirmed cases data. The findings demonstrate that network metrics derived from online search behavior can serve as complementary indicators for epidemic surveillance in urban locations such as Metro Manila. This strategy leverages the Philippines' extensive online usage during the pandemic to provide potentially valuable early signals of disease spread, and it offers a supplementary tool for public health monitoring in resource-limited settings.
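To make the role of the Sakoe-Chiba radius concrete, the following is a minimal sketch of DTW restricted to a band around the diagonal. It assumes an absolute-difference local cost and no prior normalization of either series; the function name `dtw_sakoe_chiba` is hypothetical, and the paper's exact distance measure and scaling may differ.

```python
import numpy as np

def dtw_sakoe_chiba(x, y, radius):
    """DTW distance between 1-D series x and y within a Sakoe-Chiba band.

    radius: maximum allowed index offset |i - j| between aligned points.
    Assumption: local cost is |x[i] - y[j]|; inputs are used as given.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # The band must be at least as wide as the length difference,
    # otherwise no warping path can reach the final cell.
    radius = max(int(radius), abs(n - m))

    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        lo = max(1, i - radius)
        hi = min(m, i + radius)
        for j in range(lo, hi + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]
```

With a radius of 0 the measure reduces to a pointwise sum of absolute differences, while a larger radius lets one series lead or lag the other by up to that many time steps. This is why the radius bounds how much lead time between a search-derived metric and case counts the alignment can absorb.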
