Solving Online Resource-Constrained Scheduling for Follow-Up Observation in Astronomy: a Reinforcement Learning Approach

Yajie Zhang, Ce Yu, Chao Sun, Jizeng Wei, Junhan Ju, Shanjiang Tang

TL;DR

This paper tackles online, resource-constrained scheduling of follow-up astronomical observations with a telescope array, formulating the problem as an MDP whose objective is to minimize average task slowdown. It introduces ROARS, a reinforcement learning framework that encodes schedules as DAGs and improves them by iterative local rewriting, guided by a Child-Sum Tree-LSTM-based graph encoder, with region- and rule-selection policies trained end-to-end. In simulations on real-world-inspired data, ROARS consistently outperforms classical online heuristics and approaches offline performance, while generalizing to unseen task distributions and extending to distributed arrays. The work advances practical, scalable decision-making for time-critical ToO observations and lays groundwork for multi-objective extensions and deeper integration with global telescope networks.

Abstract

In astronomical observation, allocating the observation resources of a telescope array and planning follow-up observations for targets of opportunity (ToOs) are indispensable to scientific discovery. The problem is computationally challenging because observations are scheduled online and many time-varying factors affect whether an observation can be conducted. This paper presents ROARS, a reinforcement learning approach for online astronomical resource-constrained scheduling. To capture the structure of an observation schedule, we represent every schedule as a directed acyclic graph (DAG) that encodes the timing dependencies between the observation tasks in the schedule. Deep reinforcement learning is used to learn a policy that improves a feasible solution through iterative local rewriting until convergence. This sidesteps the difficulty of constructing a complete solution from scratch, which is computationally expensive in astronomical observation scenarios because of the numerous spatial and temporal constraints. A simulation environment based on real-world scenarios is developed to evaluate the effectiveness of the proposed scheduling approach. The experimental results show that ROARS surpasses five popular heuristics, adapts to various observation scenarios, and learns effective strategies with hindsight.
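The idea of improving a feasible schedule by repeated local rewrites can be illustrated with a small sketch. This is not the ROARS implementation: the learned region- and rule-selection policies are replaced here by a plain greedy search over adjacent swaps, the setting is a single telescope, and slowdown is assumed to mean (completion time - arrival time) / duration, a common definition that this summary does not confirm. The names `Task`, `average_slowdown`, `dag_edges`, and `rewrite_until_converged` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Task:
    name: str
    arrival: float   # time the follow-up request arrives
    duration: float  # required exposure time


def completion_times(order: List[Task]) -> List[float]:
    """Run tasks back to back on one telescope; a task cannot start before it arrives."""
    clock, done = 0.0, []
    for task in order:
        clock = max(clock, task.arrival) + task.duration
        done.append(clock)
    return done


def average_slowdown(order: List[Task]) -> float:
    """Assumed definition: mean of (completion - arrival) / duration over all tasks."""
    done = completion_times(order)
    ratios = [(c - t.arrival) / t.duration for t, c in zip(order, done)]
    return sum(ratios) / len(ratios)


def dag_edges(order: List[Task]) -> List[Tuple[str, str]]:
    """Timing dependencies of a single-site schedule: each task precedes the next one."""
    return [(a.name, b.name) for a, b in zip(order, order[1:])]


def rewrite_until_converged(order: List[Task]) -> List[Task]:
    """Greedy stand-in for a learned policy: accept any adjacent swap that
    strictly lowers average slowdown, and stop when no swap helps."""
    order = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            candidate = order[:i] + [order[i + 1], order[i]] + order[i + 2:]
            if average_slowdown(candidate) < average_slowdown(order) - 1e-12:
                order, improved = candidate, True
    return order


# Hypothetical toy instance: one long task followed by two short ones.
initial = [Task("A", 0.0, 10.0), Task("B", 0.0, 2.0), Task("C", 1.0, 1.0)]
final = rewrite_until_converged(initial)
print([t.name for t in final], dag_edges(final), average_slowdown(final))
```

Because every accepted swap strictly lowers the objective and the number of orderings is finite, the loop terminates. A learned policy would instead choose which region of the DAG to rewrite and which rule to apply, which matters once multiple sites and visibility constraints make exhaustive local search too costly.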

Paper Structure

This paper contains 24 sections, 10 equations, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: Conceptual illustration of distributed telescope array follow-up observation for targets of opportunity.
  • Figure 2: Schematic diagram of the relationship between a follow-up observation target and its observation tasks. A target of opportunity that arises during a sky survey is divided into multiple follow-up observation tasks (i.e., multiple exposures) according to its observational requirements, in either continuous observation mode or cadence observation mode. In this example, the follow-up target requires simultaneous observations in the three bands u, g, and i.
  • Figure 3: An illustrative example of ROARS. By parsing incoming follow-up targets from the astronomical observation simulation environment, the scheduling algorithm completes resource allocation and outputs the observation plan to each observation site. We give an illustration of the rewriting optimization strategy and an example of two potential task schedules at a single observation site and their corresponding graphical representations.
  • Figure 4: The location information of observation sites and fields in dataset generation.
  • Figure 5: Experimental results of ROARS when varying the following task properties: (a) frequency of incoming ToOs; (b) duration of follow-up observation; (c) resource distribution; and (d) single exposure time. In each experiment, all properties other than the one being varied are set to the values used in the experiment presented in the single-site comparison table.
  • ...and 1 more figure