Deep Reinforcement Learning for Demand Driven Services in Logistics and Transportation Systems: A Survey

Zefang Zong, Jingwei Wang, Tao Feng, Tong Xia, Depeng Jin, Yong Li

TL;DR

This survey defines Demand-Driven Services (DDS), highlights their common applications and the key decision/control problems within them, reviews the existing deep reinforcement learning (DRL) solutions for each problem, and introduces open simulation environments for developing and evaluating DDS applications.

Abstract

Recent technological development has brought a boom of new Demand-Driven Services (DDS) into urban life, including ridesharing, on-demand delivery, express systems, and warehousing. In DDS, the service loop is the elemental structure; it consists of a service worker, the service providers, and the corresponding service targets. Service workers transport either people or parcels from the providers to the target locations. The planning tasks within DDS can thus be classified into two stages: 1) dispatching, which forms service loops from the demand/supply distributions, and 2) routing, which decides the specific serving order within each constructed loop. Generating high-quality strategies in both stages is important for developing DDS but faces several challenges. Meanwhile, deep reinforcement learning (DRL) has developed rapidly in recent years. It is a powerful tool for these problems because it can learn a parametric model without relying on many problem-specific assumptions and can optimize long-term effects by learning sequential decisions. In this survey, we first define DDS, then highlight common applications and the important decision/control problems within them. For each problem, we comprehensively introduce the existing DRL solutions. We also introduce open simulation environments for the development and evaluation of DDS applications. Finally, we analyze the remaining challenges and discuss further research opportunities in DRL solutions for DDS.
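
As a rough illustration of the two stages named in the abstract, the toy sketch below first dispatches demands to workers and then routes each worker through its assigned demands. It is not a DRL method and not taken from the survey: both stages use simple greedy heuristics (nearest worker, nearest neighbor), each demand is reduced to a single location rather than a provider/target pair, and the names `dispatch`, `route`, `workers`, and `demands` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def dispatch(workers: Dict[str, Point], demands: Dict[str, Point]) -> Dict[str, List[str]]:
    """Stage 1 (dispatching): form one loop per worker.

    Assumption: each demand goes to the worker whose start location is nearest.
    """
    loops: Dict[str, List[str]] = {w: [] for w in workers}
    for demand_id, loc in demands.items():
        nearest = min(workers, key=lambda w: math.dist(workers[w], loc))
        loops[nearest].append(demand_id)
    return loops


def route(start: Point, stops: List[str], demands: Dict[str, Point]) -> List[str]:
    """Stage 2 (routing): choose a serving order inside one loop.

    Assumption: greedy nearest-neighbor ordering from the worker's start.
    """
    order: List[str] = []
    position = start
    remaining = list(stops)
    while remaining:
        nxt = min(remaining, key=lambda d: math.dist(position, demands[d]))
        order.append(nxt)
        remaining.remove(nxt)
        position = demands[nxt]
    return order


# Hypothetical toy instance: two workers, four demands on a line.
workers = {"w1": (0.0, 0.0), "w2": (10.0, 0.0)}
demands = {"a": (1.0, 0.0), "b": (3.0, 0.0), "c": (9.0, 0.0), "d": (2.0, 0.0)}

loops = dispatch(workers, demands)
plans = {w: route(workers[w], loops[w], demands) for w in workers}
```

A learned DRL policy would replace the two greedy rules with decisions that account for long-term effects, which these heuristics ignore.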

Paper Structure

This paper contains 48 sections, 1 equation, 11 figures, 6 tables.

Figures (11)

  • Figure 1: The visualization of two independent service loops using instant delivery as an example. The restaurant, customer, and courier serve as the service provider, service target, and service worker.
  • Figure 2: An overview of DDS problems, covering the dispatching stage and the routing stage. The vertical axis shows the transformation from the original mathematical formulations to industrial applications, and the horizontal axis distinguishes the two planning stages. Note that the two stages are not rigidly separated, but this classification is necessary to concentrate on the primary challenge in each practical scenario. A low demand/worker ratio implies that the primary challenge is to determine how workers and demands should be matched, while a high ratio indicates that the major optimization space lies in the routing stage. We discuss this relationship in detail in the third subsection of the background section. CVRP here refers to the capacitated vehicle routing problem; the four common variants shown below it are introduced in detail in Sec. 4.
  • Figure 3: A sample of different grid-based navigation and partitioning schemes: (a) 4-way connectivity through cardinal directions, (b) 8-way connectivity with ordinal directions, (c) 6-way connectivity using hexagon-based representations, and (d) full connectivity, which can also be modeled as a graph structure.
  • Figure 4: Reinforcement learning control loop.
  • Figure 5: Classification and development of DRL algorithms.
  • ...and 6 more figures
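
The grid connectivity schemes listed in the Figure 3 caption can be sketched as neighbor-offset tables. This is a minimal illustration under stated assumptions, not code from the survey: the hexagonal case assumes axial coordinates `(q, r)`, the map is assumed to be a bounded `width` by `height` index range, and the names `neighbors`, `CARDINAL`, `ORDINAL`, and `HEX_AXIAL` are hypothetical.

```python
from typing import List, Tuple

Cell = Tuple[int, int]

# (a) 4-way connectivity: cardinal directions on a square grid.
CARDINAL: List[Cell] = [(0, 1), (1, 0), (0, -1), (-1, 0)]
# (b) 8-way connectivity adds the ordinal (diagonal) directions.
ORDINAL: List[Cell] = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
# (c) 6-way connectivity: hexagonal cells, assuming axial coordinates (q, r).
HEX_AXIAL: List[Cell] = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def neighbors(cell: Cell, scheme: str, width: int, height: int) -> List[Cell]:
    """Return the in-bounds neighbors of `cell` under the given scheme."""
    if scheme == "4-way":
        offsets = CARDINAL
    elif scheme == "8-way":
        offsets = CARDINAL + ORDINAL
    elif scheme == "6-way":
        offsets = HEX_AXIAL
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    x, y = cell
    return [
        (x + dx, y + dy)
        for dx, dy in offsets
        if 0 <= x + dx < width and 0 <= y + dy < height
    ]
```

Case (d), full connectivity, has no fixed offset table; it is more naturally stored as an explicit graph, for example an adjacency list keyed by cell.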