A High-Resolution, US-scale Digital Similar of Interacting Livestock, Wild Birds, and Human Ecosystems with Applications to Multi-host Epidemic Spread

Abhijin Adiga, Ayush Chopra, Mandy L. Wilson, S. S. Ravi, Dawen Xie, Samarth Swarup, Bryan Lewis, John Barnes, Ramesh Raskar, Madhav V. Marathe

TL;DR

The paper addresses spillover risk for highly pathogenic avian influenza by building a high-resolution, national synthetic dataset that jointly represents livestock populations and operations, processing facilities, wild bird abundances, and human demographics. It introduces the digital similar (DS), a grid-based, multi-layer representation of the contiguous US, constructed by fusing diverse data sources through integer linear programming and iterative proportional fitting. The dataset is validated against independent sources, including AgCensus, GLW, and H5N1 incidence data. The authors develop subtype-specific, spatiotemporal risk maps using a simple collocation framework, $R(i,s,t) = P(i,s) \cdot A(i,t) \cdot B_{H5}(i,t)$, which supports surveillance prioritization and scenario analysis. The DS supports targeted surveillance and control efforts and is designed to be extended to other pathogens and One Health applications through its modular structure and its interactive dashboard, DiTTO. The work reports strong alignment between predicted risk and historical outbreaks for multiple livestock subtypes and highlights persistent high-risk regions and seasonal risk patterns, offering a practical tool for policymakers and researchers in disease ecology, agriculture, and public health.
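To make the collocation formula concrete, here is a minimal sketch of how the product could be evaluated on gridded arrays. The summary does not define the three factors, so the interpretation used here is an assumption: `P` is taken to be a livestock term per cell and subtype, `A` a wild bird abundance term per cell and week, and `B_h5` an H5-related term per cell and week. The function name `collocation_risk` and the toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def collocation_risk(P, A, B_h5):
    """Evaluate R[i, s, t] = P[i, s] * A[i, t] * B_h5[i, t] for all cells i,
    subtypes s, and time steps t.

    Assumed shapes (illustrative, not from the paper):
      P    : (cells, subtypes)
      A    : (cells, weeks)
      B_h5 : (cells, weeks)
    """
    P = np.asarray(P, dtype=float)
    A = np.asarray(A, dtype=float)
    B_h5 = np.asarray(B_h5, dtype=float)
    # Broadcast the per-cell livestock term against the per-cell weekly bird term.
    return P[:, :, None] * (A * B_h5)[:, None, :]

# Toy example: 2 cells, 2 subtypes, 3 weeks (made-up values).
P = np.array([[100.0, 0.0],
              [20.0, 50.0]])
A = np.array([[0.5, 1.0, 0.0],
              [0.2, 0.2, 0.4]])
B = np.array([[0.1, 0.2, 0.0],
              [0.5, 0.0, 0.5]])

R = collocation_risk(P, A, B)
print(R.shape)     # (2, 2, 3)
print(R[0, 0, 1])  # 100 * 1.0 * 0.2 = 20.0
```

Because the formula is a plain product, the risk is zero wherever any one factor is zero, which is why a cell with no livestock of a given subtype never appears as a hotspot for that subtype.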

Abstract

One Health issues, such as the spread of highly pathogenic avian influenza (HPAI), present significant challenges at the human-animal-environmental interface. Recent H5N1 outbreaks underscore the need for comprehensive modeling efforts that capture the complex interactions between various entities in these interconnected ecosystems. To support such efforts, we develop a methodology to construct a synthetic spatiotemporal gridded dataset of livestock production and processing, human population, and wild birds for the contiguous United States, called a digital similar. This representation is the result of fusing diverse datasets using statistical and optimization techniques, followed by extensive verification and validation. The livestock component includes farm-level representations of four major livestock types (cattle, poultry, swine, and sheep), with further categorization into subtypes such as dairy cows, beef cows, chickens, turkeys, and ducks. Weekly abundance data for wild bird species identified in the transmission of avian influenza are included. Gridded distributions of the human population, along with demographic and occupational features, capture the placement of agricultural workers and the general population. We demonstrate how the digital similar can be applied to evaluate spillover risk to dairy cows and poultry from wild bird populations, then validate these results using historical H5N1 incidences. The resulting subtype-specific spatiotemporal risk maps identify hotspots of high risk from H5N1-infected wild bird populations to dairy cattle and poultry operations, thus guiding surveillance efforts.
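The TL;DR names iterative proportional fitting (IPF) as one of the statistical techniques used to fuse data sources. The sketch below shows generic two-dimensional IPF only; it is not the authors' pipeline, and the function name `ipf`, the seed table, and the marginal totals are all hypothetical. The idea is to rescale a seed table alternately by rows and by columns until its margins match the target totals.

```python
import numpy as np

def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Generic 2D iterative proportional fitting.

    Rescales `seed` so that its row sums match `row_targets` and its column
    sums match `col_targets`. The two target vectors must have equal totals.
    """
    table = np.asarray(seed, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for _ in range(max_iter):
        # Row step: scale each row to its target total.
        row_sums = table.sum(axis=1)
        table *= (row_targets / np.where(row_sums > 0, row_sums, 1.0))[:, None]
        # Column step: scale each column to its target total.
        col_sums = table.sum(axis=0)
        table *= (col_targets / np.where(col_sums > 0, col_sums, 1.0))[None, :]
        # Columns now match exactly; stop once rows also match.
        if np.abs(table.sum(axis=1) - row_targets).max() < tol:
            break
    return table

# Toy example (made-up numbers): uniform seed, two row and two column totals.
fitted = ipf(np.ones((2, 2)), row_targets=[30, 70], col_targets=[40, 60])
print(fitted)
```

With a uniform seed the fitted table is simply proportional to the product of the margins; a non-uniform seed would preserve its own interaction structure while still matching the totals.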

Paper Structure

This paper contains 58 sections, 2 equations, 17 figures, 5 tables, 3 algorithms.

Figures (17)

  • Figure 1: Overview of the digital similar. A schematic of the system highlighting the various components and the dashboard through which the data is exposed is provided at the top. Two of the four livestock layers are shown. We have zoomed in on major production regions for the respective livestock. Both population density and counts of farms are depicted. Also shown are livestock and dairy processing centers. For the human population, agricultural workers are highlighted. The spatiotemporal distribution of three wild bird populations is shown in the bottom layer.
  • Figure 2: A schematic of the livestock layer construction, including data processing, generation of farms and assignment of cells to farms. It takes as input AgCensus data that comprises head and farm counts at various administrative levels and the gridded distribution of the livestock populations from GLW. FillGaps is an integer program that fills gaps in the census data. GenFarms is an integer linear program (ILP) for distributing the livestock populations to farms consistent with the census data. FarmsToCells is an ILP that assigns farms to grid cells with the objective of aligning the population with GLW.
  • Figure 3: Alignment of the livestock layers with AgCensus and GLW datasets. (a) Head counts of assigned farms are compared with AgCensus. We have a plot for each livestock type with subtypes on the x-axis and percentage mean-normalized absolute relative difference of state totals from the census and $\mathcal{DS}$ on the y-axis. (b) The distribution of livestock populations among farms ordered by farm size. The y-axis corresponds to farm size. Separate plots for subtypes are shown for cattle and poultry.
  • Figure 4: (a) Analysis of the GenFarms algorithm: We plot the value of the parameter $\lambda_1$ (see supplement) relative to the number of heads. This parameter is the maximum absolute difference between the number of heads in AgCensus and that in the generated farms at the county level for each farm size category. Lower is better. Each value in the box plot corresponds to a county. We also plot this value for a restricted set of instances where the county totals are known. (b) Analysis of the FarmsToCells algorithm: Both plots indicate the agreement of the farm assignment with GLW data. The first plot corresponds to parameter $\lambda_5$ (see supplement) for county-livestock instances relative to the number of heads. This parameter is the maximum absolute difference between assigned head counts in a cell and its corresponding GLW value. Lower is better. The second plot uses the Pearson correlation coefficient to compare cell-level head counts aggregated from our farm assignment with GLW head counts; higher is better.
  • Figure 5: Validation of components: (a) Analysis of mapping CAFO locations by livestock type to farms from the digital similar. Farms were chosen based on the thresholds stated in the title. The first subplot shows how many CAFO locations were matched. The supplement has an additional plot for a different set of thresholds. The second subplot provides the cumulative distribution of the distances (in miles) between matched pairs of CAFO locations and farms. An additional plot for a different set of thresholds is in the supplement. (b) Analysis of H5N1 cases and bird abundance for the period of January to December of 2023. The plot summarizes the results across eight reporting states for the four quarters. The x-axis corresponds to the number of top species groups considered, while the y-axis corresponds to the count of those groups with H5N1 incidence.
  • ...and 12 more figures
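
The two agreement measures described in the Figure 4(b) caption, a maximum absolute cell-level difference and a Pearson correlation against a reference grid, can be sketched as follows. This is an illustrative computation under stated assumptions, not the authors' code: the function name `agreement_metrics` and the per-cell head counts are hypothetical, and the caption's normalization "relative to the number of heads" is not reproduced because its exact form is given only in the supplement.

```python
import numpy as np

def agreement_metrics(assigned, reference):
    """Compare per-cell head counts from a farm assignment with a reference grid.

    Returns (max_abs_diff, pearson_r):
      max_abs_diff : largest absolute per-cell difference (lower is better)
      pearson_r    : Pearson correlation across cells (higher is better)
    """
    assigned = np.asarray(assigned, dtype=float)
    reference = np.asarray(reference, dtype=float)
    max_abs_diff = np.abs(assigned - reference).max()
    pearson_r = np.corrcoef(assigned, reference)[0, 1]
    return max_abs_diff, pearson_r

# Toy example: four grid cells with made-up head counts.
assigned_heads = [120, 80, 0, 40]
reference_heads = [110, 90, 5, 35]

max_diff, r = agreement_metrics(assigned_heads, reference_heads)
print(max_diff)  # 10.0
print(round(r, 3))
```

The two measures are complementary: the maximum difference flags the single worst cell, while the correlation summarizes how well the overall spatial pattern is preserved.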