Disk2Planet: A Robust and Automated Machine Learning Tool for Parameter Inference in Disk-Planet Systems

Shunyuan Mao; Ruobing Dong; Kwang Moo Yi; Lu Lu; Sifan Wang; Paris Perdikaris

TL;DR

This work introduces Disk2Planet, a machine-learning-based tool to infer key parameters in disk–planet systems from observed protoplanetary disk structures, and demonstrates that the tool achieves percent-level or higher accuracy, and is able to handle missing data and unknown levels of noise.

Abstract

We introduce Disk2Planet, a machine learning-based tool to infer key parameters in disk-planet systems from observed protoplanetary disk structures. Disk2Planet takes as input disk structures in the form of two-dimensional density and velocity maps, and outputs five disk and planet properties: the Shakura--Sunyaev viscosity, the disk aspect ratio, the planet--star mass ratio, and the planet's radius and azimuth. The tool combines the Covariance Matrix Adaptation Evolution Strategy (CMA--ES), an evolutionary algorithm tailored for complex optimization problems, with the Protoplanetary Disk Operator Network (PPDONet), a neural network designed to predict solutions of disk--planet interactions. It is fully automated and can retrieve the parameters of one system in three minutes on an Nvidia A100 graphics processing unit. We empirically demonstrate that the tool achieves percent-level or higher accuracy and can handle missing data and unknown levels of noise.
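
The loop described in the abstract (sample candidate parameters, run a forward solver, score the output against the data, update the search distribution) can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_forward` is a hypothetical two-parameter stand-in for PPDONet, the mean-squared-error score is an assumption, and the update is a simplified elite-refit Gaussian search rather than full CMA--ES. Only the population size of 128 is taken from the paper.

```python
import numpy as np

def toy_forward(params, grid):
    """Hypothetical stand-in for PPDONet: maps (amplitude, centre) to a 1D profile."""
    amplitude, centre = params
    return amplitude * np.exp(-0.5 * ((grid - centre) / 0.1) ** 2)

def score(model, data):
    """Assumed score: mean squared difference between model and data."""
    return float(np.mean((model - data) ** 2))

def infer(data, grid, n_iter=60, pop=128, elite=32, seed=0):
    """Simplified Gaussian search (NOT full CMA-ES): refit the Gaussian to the best samples."""
    rng = np.random.default_rng(seed)
    mean = np.array([0.5, 0.5])   # initial guess for (amplitude, centre)
    cov = np.eye(2) * 0.09        # initial spread (standard deviation 0.3)
    for _ in range(n_iter):
        # Draw a population of candidate parameter sets from the current Gaussian.
        samples = rng.multivariate_normal(mean, cov, size=pop)
        # Evaluate each candidate with the forward model and score it against the data.
        scores = np.array([score(toy_forward(p, grid), data) for p in samples])
        # Keep the lowest-score candidates and refit the Gaussian to them.
        best = samples[np.argsort(scores)[:elite]]
        mean = best.mean(axis=0)
        cov = np.cov(best, rowvar=False) + 1e-10 * np.eye(2)  # small floor keeps cov valid
    return mean

grid = np.linspace(0.0, 1.0, 200)
truth = np.array([0.8, 0.3])
data = toy_forward(truth, grid)   # noise-free synthetic "observation"
estimate = infer(data, grid)
print("true parameters:    ", truth)
print("inferred parameters:", estimate)
```

Full CMA--ES additionally adapts the step size and covariance using evolution paths and weighted recombination; the sketch keeps only the sample, score, and update structure.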

Paper Structure (12 sections, 4 equations, 6 figures)

Introduction
Method
The PPDONet-based Forward Problem Solver
The Score for Data-Output Comparisons
The Optimization Algorithm CMA--ES
Performance
The baseline case --- input data with surface density only
Input data with multiple quantities
Input data with noise
Input data with missing parts
Advantages over existing inverse problem solvers
Conclusions and future perspectives

Figures (6)

Figure 1: The framework of conventional inverse problem solvers. Each iteration begins with an estimation of the system configuration, which includes parameters in a disk-planet system. This configuration is sent to a numerical solver to generate the resulting disk maps. Expert analysis then evaluates the visual discrepancies between the simulation outcomes and the observational data, leading to iterative refinements of the system parameters.
Figure 2: A flow chart illustrating the iterative parameter update procedure in the Disk2Planet inverse problem solver. The solver iterates a 5D Gaussian distribution representing the most probable parameters ($\alpha$, $h_\mathrm{0}$, $q$, $r_\mathrm{p}$, $\theta_\mathrm{p}$) that result in the minimal difference between the model and the input data. At each iteration, the Gaussian distribution is refined by testing 128 sets of parameters sampled from the current distribution. These parameters are processed by the ML-based solver PPDONet (§\ref{sec: network}) to predict the corresponding disk maps. Differences between these maps and the data are quantified by the score (§\ref{sec:compare}). The CMA--ES optimizer (§\ref{sec: algorithm}) updates the Gaussian distribution to converge towards parameter sets with smaller scores. This process iterates until the distribution converges to the optimal parameters. On the right, we show an example of the inference process: (a) the input surface density map and its true parameters; (b) the evolution of the best score over iterations; (c) the inferred parameters and the corresponding surface density solution.
Figure 3: Score calculation for three cases of input data. Block A: For noisy input data, the score is calculated the same way as for noise-free data. Block B: For input data with missing areas, only the available pixels are used. Block C: For input data containing multiple quantities, the error for each quantity is calculated, and then the average of all errors is taken as the score.
Figure 4: Ground truth and inferred planet masses in 256 tests using full noise-free maps of surface density as input. The tests achieve an $r^2$-score (Eq. \ref{eq:r2}) of 0.9994 and an inference uncertainty $\sigma$ of 0.0073, indicating near-perfect agreement (red solid line). See §\ref{sec:input-sigma} for details.
Figure 5: Representative examples of the input datasets, and the corresponding parameter inference uncertainties ($\sigma$). The input datasets contain images of, from top to bottom, $\Sigma$ (a), $\Sigma + v_\mathrm{LOS}$ (b), $\Sigma + v_\mathrm{r} + v_\mathrm{\theta}$ (c), $\Sigma$ with noise (d), $\Sigma + v_\mathrm{LOS}$ with noise (e), radially cropped $\Sigma$ (f), and azimuthally cropped $\Sigma$ (g). 256 tests are run for each dataset. The uncertainties are calculated as half the difference between the $84^{\mathrm{th}}$ and $16^{\mathrm{th}}$ percentiles of the error distribution. The numbers on the bar plot give the uncertainties for the baseline dataset (a) and show how the uncertainties in the other datasets (b)--(g) compare to those in the baseline case.
...and 1 more figure
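
The quantities named in the captions of Figures 3 to 5 can be written down in a few lines. The sketch below is illustrative only: the per-quantity error is assumed to be a mean squared error (the captions do not specify the metric), missing pixels are assumed to be marked as NaN, and `r2_score` uses the standard coefficient of determination, which may differ in detail from the paper's Eq. for $r^2$. The function names are hypothetical.

```python
import numpy as np

def masked_score(model_maps, data_maps):
    """Average, over quantities, of the error computed on available pixels only.

    Both arguments are dicts mapping a quantity name (e.g. "sigma", "v_los")
    to a 2D array. Missing data pixels are assumed to be NaN.
    """
    errors = []
    for name, data in data_maps.items():
        valid = np.isfinite(data)                 # Block B: use only available pixels
        diff = model_maps[name][valid] - data[valid]
        errors.append(np.mean(diff ** 2))         # assumed per-quantity error (MSE)
    return float(np.mean(errors))                 # Block C: average over quantities

def inference_uncertainty(errors):
    """Half the difference between the 84th and 16th percentiles of the errors."""
    p16, p84 = np.percentile(errors, [16, 84])
    return float(0.5 * (p84 - p16))

def r2_score(truth, inferred):
    """Standard coefficient of determination between true and inferred values."""
    truth = np.asarray(truth, dtype=float)
    inferred = np.asarray(inferred, dtype=float)
    ss_res = np.sum((truth - inferred) ** 2)
    ss_tot = np.sum((truth - truth.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Tiny example: one missing pixel in the density map, plus a velocity map.
data = {"sigma": np.array([[1.0, 2.0], [np.nan, 4.0]]),
        "v_los": np.zeros((2, 2))}
model = {"sigma": np.array([[1.0, 2.0], [3.0, 5.0]]),
         "v_los": np.ones((2, 2))}
example_score = masked_score(model, data)
example_sigma = inference_uncertainty(np.arange(101))
```

For a Gaussian error distribution, half the 16th-to-84th percentile range equals one standard deviation, which is why this statistic serves as a robust $\sigma$.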
