Stochastic Dynamic Network Utility Maximization with Application to Disaster Response

Anna Scaglione, Nurullah Karakoc

TL;DR

The paper addresses resource allocation under stochastic dynamics in large-scale disaster response by formulating a stochastic dynamic NUM problem over multiple local MDPs tied by a global resource cap. It solves this via a distributed primal-dual approach: local subproblems $F_l(y^l)$ are solved with deep reinforcement learning on agent-based simulations, while a central layer uses dual prices to coordinate allocations. To enable online tractability, the authors introduce a concave, non-decreasing interpolation $\hat{F}_l(y)$ built from samples and prove an optimality-gap bound. The methodology is validated through two case studies (pandemic vaccine distribution and wildfire firefighting), demonstrating rolling-horizon reallocation that adapts to ground data and forecasts. The work provides a practical, scalable framework for ICS-style disaster response that blends DRL-based local optimization with market-like global coordination.
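
As an illustration of the interpolation step, the following is a minimal sketch of one way to build a concave, non-decreasing piecewise-linear function from sampled (allocation, utility) pairs. It is not taken from the paper: the construction used here (the least concave majorant of the samples, held constant after its peak) and the names `fit_concave_nondecreasing` and `evaluate` are assumptions made for this example, and the sample values are made up.

```python
from bisect import bisect_right


def _cross(o, a, b):
    """z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def fit_concave_nondecreasing(ys, fs):
    """Breakpoints of a concave, non-decreasing piecewise-linear fit.

    Assumed construction (not the paper's): take the upper convex hull of
    the samples, then drop every breakpoint after the peak so the function
    stays flat instead of decreasing.
    """
    # Keep the best sampled utility for each distinct allocation level.
    best = {}
    for y, f in zip(ys, fs):
        best[y] = max(f, best.get(y, f))

    hull = []
    for p in sorted(best.items()):
        # Pop the last breakpoint while it lies on or below the chord
        # from the breakpoint before it to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)

    peak = max(range(len(hull)), key=lambda i: hull[i][1])
    return hull[: peak + 1]


def evaluate(hull, y):
    """Evaluate the fit at allocation y (constant beyond the last breakpoint)."""
    if y < hull[0][0]:
        raise ValueError("allocation below the smallest sampled value")
    if y >= hull[-1][0]:
        return hull[-1][1]
    i = bisect_right([p[0] for p in hull], y) - 1
    (y0, f0), (y1, f1) = hull[i], hull[i + 1]
    return f0 + (f1 - f0) * (y - y0) / (y1 - y0)


# Made-up noisy samples of a local utility at five allocation levels.
samples_y = [0, 1, 2, 3, 4]
samples_f = [0.0, 3.0, 3.5, 6.0, 5.5]
breakpoints = fit_concave_nondecreasing(samples_y, samples_f)
print(breakpoints, evaluate(breakpoints, 2))
```

The fit lies on or above every sample, so noisy dips in the sampled utilities are smoothed over rather than reproduced; whether that bias is acceptable depends on how the samples were generated.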

Abstract

In this paper, we are interested in solving Network Utility Maximization (NUM) problems whose underlying local utilities and constraints depend on a complex stochastic dynamic environment. While the general model applies broadly, this work is motivated by resource sharing during disasters occurring concurrently in multiple areas. In such situations, hierarchical layers of Incident Command Systems (ICS) are engaged; specifically, a central entity (e.g., the federal government) typically coordinates the incident response by allocating resources to different sites, where local entities then distribute them to those affected. The benefits of an allocation decision to the different sites are generally not expressed explicitly as a closed-form utility function because of the complexity of the response and the random nature of the underlying phenomenon we try to contain. We use the classic approach of decomposing the NUM formulation and applying a primal-dual algorithm to achieve optimal higher-level decisions under coupled constraints, while modeling the optimized response to the local dynamics with deep reinforcement learning algorithms. The decomposition we propose has several benefits: 1) the entities respond to their local utilities based on a congestion signal conveyed by the ICS upper layers; 2) the complexity of capturing the utility of local responses and their diversity is addressed effectively without sharing local parameters and priorities with the ICS layers above; 3) utilities, not known as explicit functions, are approximated as convex functions of the resources allocated; 4) decisions rely on up-to-date data from the ground along with future forecasts.
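
The price-based coordination described above can be sketched with a textbook dual-decomposition loop. This is only an illustrative sketch under strong assumptions, not the paper's algorithm: each site's utility is replaced by a closed-form stand-in `weight * log(1 + y)` (in the paper the local response comes from deep reinforcement learning), and the names `local_response` and `coordinate`, the step size, and the iteration count are all hypothetical.

```python
def local_response(weight, price, cap):
    """Site-level best response to the congestion price.

    Maximizes weight * log(1 + y) - price * y over 0 <= y <= cap.
    The log utility is a stand-in chosen for this sketch only.
    """
    return min(cap, max(0.0, weight / price - 1.0))


def coordinate(weights, capacity, step=0.05, iters=200, price=1.0):
    """Central layer: adjust one price until total demand meets capacity."""
    for _ in range(iters):
        # Each site answers the current price using only its own utility.
        ys = [local_response(w, price, capacity) for w in weights]
        # Raise the price when demand exceeds capacity, lower it otherwise.
        price = max(1e-6, price + step * (sum(ys) - capacity))
    ys = [local_response(w, price, capacity) for w in weights]
    return price, ys


price, allocations = coordinate(weights=[1.0, 2.0, 3.0], capacity=6.0)
print(price, allocations)
```

Only the scalar price and the requested amounts cross the boundary between layers, which mirrors benefit 2) above: the central entity never needs the sites' local parameters or priorities.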

Paper Structure

This paper contains 19 sections, 1 theorem, 25 equations, 10 figures, and 2 tables.

Key Result

Theorem 1

Assume the sum-utility function $f(Y)$ is increasing and strongly concave, i.e., $- \nabla^2 f(Y) \succeq m_f I, ~\forall \ell$, for some $m_f > 0$. Then, we have

Figures (10)

  • Figure 1: Message passing for the higher-layer allocations
  • Figure 2: An example of a social graph for pandemic propagation, where orange nodes represent teenagers, green nodes represent adults, and purple nodes represent the elderly population. The numbers on the nodes denote their unique identifiers.
  • Figure 3: Performance of various policies at different locations.
  • Figure 4: Negative utility vs. doses allocated per day to a location, along with convex piecewise-linear interpolations
  • Figure 5: An example run of higher-layer allocations with $z = 6$
  • ...and 5 more figures

Theorems & Definitions (1)

  • Theorem 1