DeliverAI: Reinforcement Learning Based Distributed Path-Sharing Network for Food Deliveries

Ashman Mehra; Snehanshu Saha; Vaskar Raychoudhury; Archana Mathur

TL;DR

DeliverAI addresses the last-mile food delivery problem by learning path-sharing policies via multi-agent reinforcement learning on a city-scale Overlay Network. It employs Q-learning with inter-agent communication through Preferred Action Sets to route deliveries across hotspots, balancing distance, fleet size, and delivery time to obtain a Pareto-optimal trade-off. Across Chicago-based simulations, DeliverAI reduces total delivery distance by about 13% and fleet size by about 12%, while achieving roughly 50% higher fleet utilization and maintaining high delivery success rates. The approach enables real-time, scalable decision-making for city-wide on-demand food delivery networks, with potential for larger fleets and more advanced RL methods in future work.

Abstract

Delivery of items from the producer to the consumer has grown significantly over the past decade, and the recent pandemic accelerated that growth. Amazon Fresh, Shopify, UberEats, InstaCart, and DoorDash are growing rapidly and share the same business model of delivering consumer items or food. Existing food delivery methods are sub-optimal because each delivery is individually optimized to go directly from the producer to the consumer via the shortest-time path. We observe significant scope for reducing the costs of completing deliveries under the current model. We model the food delivery problem as a multi-objective optimization in which both consumer satisfaction and delivery costs must be optimized. Taking inspiration from the success of ride-sharing in the taxi industry, we propose DeliverAI, a reinforcement learning-based path-sharing algorithm. Unlike previous attempts at path-sharing, DeliverAI provides real-time, time-efficient decision-making using a reinforcement learning-enabled agent system. Our novel agent interaction scheme leverages path-sharing among deliveries to reduce the total distance traveled while keeping the delivery completion time in check. We develop and rigorously test our methodology on a simulation setup using real data from the city of Chicago. Our results show that DeliverAI can reduce the delivery fleet size by 12% and the distance traveled by 13%, and achieve 50% higher fleet utilization compared to the baselines.
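As a rough illustration of the kind of learning rule the summary refers to, the following is a minimal sketch of a standard tabular Q-learning update with epsilon-greedy action selection. It is not the paper's implementation: the hotspot names (`h1`, `h2`, ...), the reward value, and the hyperparameters `alpha`, `gamma`, and `epsilon` are all assumptions made for the example, and the Preferred Action Set communication between agents is not modeled here.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning step and return the new value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    # Value of the best action available from the next state (0 if terminal).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical usage: an agent at hotspot "h1" forwards a delivery to "h2"
# and receives a negative reward standing in for the distance traveled.
q_table = defaultdict(float)
new_value = q_update(q_table, state="h1", action="h2", reward=-2.0,
                     next_state="h2", next_actions=["h3", "h4"])
```

Using a negative reward for distance is only one plausible choice; a multi-objective setting like the one described would need a reward that also reflects delivery time and fleet usage.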

Paper Structure (29 sections, 12 equations, 21 figures, 6 tables, 3 algorithms)

Introduction
Motivation
Contributions
Related work
Problem Definition and Objectives
Problem Definition and Key Terms
Multi-Objective Formulation
Delivery Network Model
Reinforcement Learning for Training RL Agents
DeliverAI Algorithm
Agent Interaction
Agent Request Handling
Experimental Setup And Performance Metrics
Data Source
Simulation Setup
...and 14 more sections

Figures (21)

Figure 1: A schematic of path-sharing in a network of nodes (hotspots). The deliveries $d_1$ and $d_2$ can deviate from their shortest paths (shown by blue and green dotted arrows) to follow a common path (shown by red solid arrows), reducing the total distance traveled at the cost of extra time to reach their destinations.
Figure 2: Overlay Network visualization on the map of Chicago. Layer 1 shows the map of Chicago with some consumer and producer locations (for representation). Layer 2 shows the division of the city into census tracts, with a hotspot placed in each tract. Layer 3 shows the Overlay Network clique, representing the network of delivery vehicles spanning the city.
Figure 3: Events in a section of the Overlay Network, showing how deliveries synchronize to form a delivery-sharing pair. Four deliveries ($d_1,d_2,d_3,d_4$) traverse the section shown using three vehicles ($v_1,v_2,v_3$). Without sharing, four vehicles would be required to traverse the same section.
Figure 4: Data Collection Procedure for Simulator Testing. The figure shows how the data is assembled from various sources, organized, and filtered to construct the simulation environment.
Figure 5: Hotspot placement for a few Census Tracts of Chicago. The hotspot is placed at the centroid of all consumers, as explained in the Delivery Network Model section.
...and 16 more figures
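
The trade-off described in the Figure 1 caption can be made concrete with a small worked example. The sketch below is illustrative only: the node names and coordinates are invented, distances are straight-line, and it assumes that a leg shared by two deliveries is driven by a single vehicle and therefore counted once.

```python
import math

# Hypothetical hotspot coordinates (arbitrary units), not taken from the paper.
COORDS = {
    "A": (0, 0), "B": (0, 2),   # origins of deliveries d1 and d2
    "M": (1, 1), "N": (4, 1),   # hotspots on the shared leg
    "C": (5, 0), "D": (5, 2),   # destinations of d1 and d2
}


def dist(u, v):
    """Straight-line distance between two named nodes."""
    (x1, y1), (x2, y2) = COORDS[u], COORDS[v]
    return math.hypot(x1 - x2, y1 - y2)


def path_length(path):
    """Distance experienced by a single delivery along its path."""
    return sum(dist(u, v) for u, v in zip(path, path[1:]))


def total_vehicle_distance(paths):
    """Total distance driven, counting each shared directed leg once (assumption)."""
    legs = {(u, v) for p in paths for u, v in zip(p, p[1:])}
    return sum(dist(u, v) for u, v in legs)


direct = [["A", "C"], ["B", "D"]]                      # each delivery goes straight
shared = [["A", "M", "N", "C"], ["B", "M", "N", "D"]]  # both use the leg M -> N

saved = total_vehicle_distance(direct) - total_vehicle_distance(shared)
detour = path_length(shared[0]) - path_length(direct[0])
print(f"total distance saved: {saved:.3f}, extra distance per delivery: {detour:.3f}")
```

In this toy layout the shared plan lowers the total distance driven while lengthening each delivery's own route, which is the tension the multi-objective formulation is meant to balance.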
