Graph Learning for Bidirectional Disease Contact Tracing on Real Human Mobility Data

Sofia Hurtado, Radu Marculescu

TL;DR

When only 30% of symptomatic individuals are tested, bidirectional contact tracing can reduce the infectious effective reproduction rate by 71%, significantly controlling the outbreak. A new Infectious Path Centrality network metric informs a graph learning edge classifier that identifies important transmission events.

Abstract

For rapidly spreading diseases where many cases show no symptoms, swift and effective contact tracing is essential. While exposure notification applications provide alerts on potential exposures, a fully automated system is needed to track infectious transmission routes. To this end, our research leverages large-scale contact networks from real human mobility data to identify the path of transmission. More precisely, we introduce a new Infectious Path Centrality network metric that informs a graph learning edge classifier to identify important transmission events, achieving an F1-score of 94%. Additionally, we explore bidirectional contact tracing, which quarantines individuals both retroactively and proactively, and compare its effectiveness against traditional forward tracing, which only isolates individuals after they test positive. Our results indicate that when only 30% of symptomatic individuals are tested, bidirectional tracing can reduce the infectious effective reproduction rate by 71%, thus significantly controlling the outbreak.

Paper Structure

This paper contains 16 sections, 4 equations, 10 figures, 1 table.

Figures (10)

  • Figure 1: Manual contact tracing involves collecting past interactions to construct a directed acyclic graph (DAG), where parent nodes are potential sources of infection for their child nodes (forming a contact tracing network). When identifying superspreaders, potential infections, or vaccination candidates, most studies use network analysis techniques such as betweenness centrality on networks with static interactions but dynamic node labels (i.e., health status). However, as illustrated in the contact networks from Day 1 and Day 2 (left), nodes with the highest betweenness centrality [newman_centrality] do not necessarily hold significant roles in the contact tracing network (right). Instead, nodes with the highest value of our proposed metric, Infectious Path Centrality (which measures the number of paths connecting two positive leaf nodes), are often the most recent common ancestors, making them (and their offspring) crucial for targeted quarantines. We evaluate our metric within a bidirectional graph learning mitigation framework, which uses this new transmission network metric to identify and quarantine unseen branches of the disease, and compare that framework against traditional forward contact tracing, which quarantines those who test positive.
  • Figure 2: Our experiment starts by processing Foursquare mobility data containing device dwell times at points of interest (POIs) into person-to-person contact networks (step 1). Two devices (i.e., people) are connected when they visit the same location within the same hour. We then apply an agent-based epidemiological SEIR (Susceptible-Exposed-Infectious-Recovered) model [rSEIR] on the dynamic contact networks (step 2). To keep track of transmission events, we form a contact tracing network where the parent is a potential source of infection to the child (step 3). For every transmission event (infection in step 2), we add the infectious interaction to the contact tracing network, as well as all other interactions the infectee has on the day of infection. This mimics manual contact tracing, where someone recalls all of their interactions on the day they got infected. Step 4 consists of calculating our proposed Infectious Path Centrality metric, which is then used as a feature in edge classification (step 5). The graph learning module classifies the edges in the contact tracing network as 'infectious' (i.e., a true transmission event) or 'non-infectious'. Finally, we test the efficacy of our approach by comparing populations seeded with the same infectious individuals under no mitigation, forward contact tracing, and bidirectional contact tracing using our mitigation framework (step 6).
  • Figure 3: (a) Notation for equations 2-4. (b) Toy example of a contact tracing network using the notation for our proposed Infectious Path Centrality metric. Nodes $u$ and $z$ just tested positive and are tracing past interactions to identify which of nodes $v$ and $i$ is the likely source of infection; we assume node $y$ infected node $z$. $H$ denotes the number of hops (i.e., depth) that the Infectious Path Centrality encompasses. $\alpha$ denotes a propagation decay constant that facilitates calculating the weight $w$ of each edge in $N_h$ (the $h$-hop neighborhood). Note that $w$ acts as an attenuating signal originating from node $u$ that sub-samples the larger contact tracing network. The term $\phi_y$ represents the total weight of the incoming edges of node $y$ from all paths originating from infectious leaf nodes (i.e., nodes $u$ and $z$). The Infectious Path Centrality $\pi_v$ then quantifies the forward accumulation of all $\phi$ values leading back to $u$'s immediate neighbors (i.e., $N_1(u)$). (c) Example calculations for $w_{(u,v)}$, $\phi_y$, and $\pi_v$. (d) Step-by-step process of calculating Infectious Path Centrality, where 1) we input the contact tracing network and 2) reverse the edges to attenuate the weights down to the maximum hop level $H$. 3) Each node accumulates weights $w$ from all incoming edges to get $\phi$. 4) We then restore the original edge directions and accumulate $\phi$ back to the first-hop neighbors of the leaf nodes. 5) Next, we add all the $\phi$ from the incoming edges into the first-hop neighbors to create their respective Infectious Path Centrality values $\pi$. 6) Finally, we normalize all $\pi$ values and use them as features in the edge classification module. By design, if a node exists on paths leading to multiple infectious leaves, its $\pi$ value will be greater. We hypothesize that the (orange) node with the maximum $\pi$ is likely the infectious source.
  • Figure 4: (a) A 500-node sub-sample of a contact tracing network. Red edges signify a transmission event where the parent node infects the child node. (b) An ego-network view of a leaf node that has recently tested positive. Its ego-network consists of all contacts made on the day of infection (incoming edges), as well as those it has infected (outgoing red edges). (c) A zoomed-in view of an infectious node with only one incoming red edge (i.e., the source of infection) and many outgoing red edges (infections it caused). There are also many incoming grey edges that signify interactions on the day of infection that were not transmission events.
  • Figure 5: The in-degree histogram in log-log scale shows that the contact tracing network is scale-free. This means that a few nodes have many incoming edges, while most nodes have few. The class imbalance for identifying the incoming transmission edge is proportional to the in-degree, which makes the training data for an edge classifier heavily imbalanced.
  • ...and 5 more figures
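
The contact-network construction described in the Figure 2 caption (step 1), where two devices are connected when they visit the same location within the same hour, can be illustrated with a minimal sketch. This is not the authors' pipeline: the `(device_id, poi_id, hour)` record layout, the integer hour bucket, and the function name `build_contact_edges` are assumptions made for illustration, and real dwell-time records would first need to be expanded into the hours they cover.

```python
from collections import defaultdict
from itertools import combinations


def build_contact_edges(visits):
    """Build hourly person-to-person contact edges from visit records.

    `visits` is an iterable of (device_id, poi_id, hour) tuples, where
    `hour` is an integer hour bucket. This schema is hypothetical.
    Returns a dict mapping hour -> set of undirected edges (a, b), a < b.
    """
    # Group the devices seen at each (POI, hour) pair.
    present = defaultdict(set)
    for device, poi, hour in visits:
        present[(poi, hour)].add(device)

    # Connect every pair of devices that shared a POI in the same hour.
    edges = defaultdict(set)
    for (_poi, hour), devices in present.items():
        for a, b in combinations(sorted(devices), 2):
            edges[hour].add((a, b))
    return edges


# Toy data: d1 and d2 share the cafe at hour 9; d1 and d3 share the gym at hour 10.
visits = [
    ("d1", "cafe", 9),
    ("d2", "cafe", 9),
    ("d3", "cafe", 10),
    ("d1", "gym", 10),
    ("d3", "gym", 10),
]
contacts = build_contact_edges(visits)
```

Keying the edges by hour keeps the network dynamic, so an epidemiological model such as the SEIR step in the caption could iterate over the hourly snapshots in order. Sorting the devices before pairing stores each undirected contact once.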