Reinforcement learning

Sarod Yatawatta

TL;DR

This paper surveys modern deep reinforcement learning (RL) methods and their applicability to astronomy, grounding the discussion in the Markov decision process $(\mathcal{S},\mathcal{A},\mathcal{R},\mathcal{P})$ and key quantities such as the state-action value $Q(s,a)$, the state value $V(s)$, and the policy $\pi$. It covers model-free algorithms for discrete and continuous actions (DDPG, TD3, SAC), together with components such as experience replay and target networks, and extends to model-based RL using probabilistic ensembles with trajectory sampling (PETS). A notable contribution is the introduction of hint assisted RL to inject domain knowledge, along with practical guidance for applying RL to astronomical tasks and a calibration example that uses AIC as the reward. The work emphasizes data efficiency and planning under uncertainty, and provides public code to facilitate adoption in data-intensive astronomical workflows.

Abstract

Observing celestial objects and advancing our scientific knowledge about them involve tedious planning, scheduling, data collection and data post-processing. Many of these operational aspects of astronomy are guided and executed by expert astronomers. Reinforcement learning is a mechanism by which we (as humans and astronomers) can teach artificial intelligence agents to perform some of these tedious tasks. In this paper, we present a state-of-the-art overview of reinforcement learning and how it can benefit astronomy.

Paper Structure

This paper contains 22 sections, 30 equations, 12 figures, 4 tables, 4 algorithms.

Figures (12)

  • Figure 1: An agent interacting with its environment. The agent receives an observation, performs an action, and receives a reward corresponding to that action.
  • Figure 2: The maze environment with $5$ valid states $0,1,\ldots,4$. The agent can move (act) $\leftarrow$, $\rightarrow$, $\uparrow$, or $\downarrow$. The state space $\mathcal{S}$ is discrete with $5$ states, and the action space $\mathcal{A}$ is also discrete with $4$ actions.
  • Figure 3: An RL agent composed of an actor and a critic.
  • Figure 4: Model-based RL. A dynamics model representing the environment is created and used by the agent.
  • Figure 5: Hint assisted RL. An external hint is provided directly to the actor in the RL agent.
  • ...and 7 more figures
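
The interaction loop of Figure 1 and the discrete spaces of Figure 2 can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the maze layout, the goal state, the reward of $-1$ per step, the hyperparameters, and the use of plain tabular Q-learning as the agent (the paper's own agents are deep RL methods).

```python
import random

# Hypothetical layout (NOT the paper's Figure 2), chosen only for illustration:
#   0 1 2
#       3
#       4   <- goal
N_STATES, N_ACTIONS = 5, 4
LEFT, RIGHT, UP, DOWN = range(N_ACTIONS)
GOAL = 4

# Valid moves; any (state, action) pair not listed hits a wall,
# so the agent stays where it is.
MOVES = {
    (0, RIGHT): 1,
    (1, LEFT): 0, (1, RIGHT): 2,
    (2, LEFT): 1, (2, DOWN): 3,
    (3, UP): 2, (3, DOWN): 4,
}


def step(state, action):
    """Environment: return (next_state, reward, done) for one action."""
    next_state = MOVES.get((state, action), state)
    # Assumed reward: -1 per step, so shorter paths score higher.
    return next_state, -1.0, next_state == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0  # every episode starts in state 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)  # explore
            else:
                action = max(range(N_ACTIONS), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Do not bootstrap from the terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to Q."""
    return [max(range(N_ACTIONS), key=lambda a: q[s][a])
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    names = ["left", "right", "up", "down"]
    print([names[a] for a in greedy_policy(train())])
```

Here `step` plays the role of the environment and the loop inside `train` plays the role of the agent: it observes the state, chooses an action, and updates $Q(s,a)$ from the reward it receives. A table suffices only because both spaces are tiny; with large or continuous spaces the table would be replaced by a function approximator.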