Can AI Understand Our Universe? Test of Fine-Tuning GPT by Astrophysical Data

Yu Wang, Shu-Rui Zhang, Aidin Momtaz, Rahim Moradi, Fatemeh Rastegarnia, Narek Sahakyan, Soroush Shakeri, Liang Li

TL;DR

This article fine-tunes the generative pre-trained transformer (GPT) model on astronomical data from observations of galaxies, quasars, stars, and gamma-ray bursts, and from simulations of black holes. It considers this a successful test, marking the LLM's proven efficacy in scientific research.

Abstract

ChatGPT has been the most talked-about concept in recent months, captivating professionals and the general public alike, and has sparked discussions about the changes that artificial intelligence (AI) will bring to the world. As physicists and astrophysicists, we are curious whether scientific data can be correctly analyzed by large language models (LLMs) and yield accurate physics. In this article, we fine-tune the generative pre-trained transformer (GPT) model on astronomical data from observations of galaxies, quasars, stars, and gamma-ray bursts (GRBs), and from simulations of black holes (BHs). The fine-tuned model demonstrates its capability to classify astrophysical phenomena, distinguish between two types of GRBs, deduce the redshift of quasars, and estimate BH parameters. We regard this as a successful test, marking the LLM's proven efficacy in scientific research. With the ever-growing volume of multidisciplinary data and the advancement of AI technology, we look forward to the emergence of a more fundamental and comprehensive understanding of our universe. This article also shares some thoughts on data collection and AI design. Using the approach of understanding the universe - looking outward at data and inward for fundamental building blocks - as a guideline, we propose a method of series expansion for AI, suggesting ways to train and control AI that is smarter than humans.
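
To make the idea of fine-tuning a language model on numerical data concrete, the following is a minimal sketch of one way a spectrum could be serialized into a text training record. The prompt/completion layout, the separator string, the rounding, and the function name `spectrum_to_record` are all assumptions made for illustration; the abstract does not specify the format the authors used, and the flux values are invented toy numbers.

```python
import json


def spectrum_to_record(flux, label, decimals=2):
    """Turn a list of flux values into one prompt/completion training record.

    The layout and separator are assumptions for illustration only;
    the abstract does not state the serialization actually used.
    """
    # Write each flux value as fixed-precision text, separated by spaces.
    prompt = " ".join(f"{value:.{decimals}f}" for value in flux) + "\n\n###\n\n"
    # A leading space before the label is a common convention for completions.
    completion = " " + label
    return {"prompt": prompt, "completion": completion}


# Hypothetical toy spectra (not real survey data).
examples = [
    ([0.12, 0.35, 0.80, 0.41], "quasar"),
    ([1.05, 0.98, 1.10, 1.02], "star"),
]

# One JSON object per line (JSONL), a common container for fine-tuning data.
jsonl_lines = [json.dumps(spectrum_to_record(flux, label)) for flux, label in examples]
print("\n".join(jsonl_lines))
```

The same pattern would apply to regression-style targets such as redshift by writing the number as the completion text, though how well a given precision or separator works is an empirical question this sketch does not address.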

Paper Structure (2 sections, 1 equation, 7 figures)

This paper contains 2 sections, 1 equation, 7 figures.

Figures (7)

  • Figure 1: Example of quasar, galaxy, star, and BAL spectra. The lighter colors depict the original high-resolution spectral data, while the darker colors represent the downsampled version, consisting of 100 data points for each spectrum.
  • Figure 2: Confusion matrix of the four SDSS classes. For example, 40 quasars are correctly predicted as quasars, 3 quasars are wrongly predicted as galaxies, and 7 quasars are wrongly predicted as BAL.
  • Figure 3: Top: histogram of relative error of the predicted redshift. Bottom: example of redshift prediction of 100 random quasars.
  • Figure 4: Example of the feature selection procedure by SVM. Left: two parameters ('FLUX_BATSE_64' and 'FLNC_BAND_ERGFLUX') from the catalog are selected to compute the decision boundary and then to compute the accuracy of classification. Right: the corresponding ROC curve.
  • Figure 5: BH spectra of different spin directions, values and inclination angles. Units are arbitrarily scaled.
  • ...and 2 more figures
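
The downsampling mentioned in the Figure 1 caption, where each spectrum is reduced to 100 data points, can be sketched as follows. Averaging over contiguous bins is an assumption here; the caption states only the target size, not the method (interpolation or simple decimation would be alternatives). The function name `downsample` and the test spectrum are hypothetical.

```python
def downsample(values, n_points=100):
    """Reduce a sequence to n_points by averaging contiguous bins.

    Bin averaging is an assumption; the caption says only that each
    spectrum is reduced to 100 points, not how.
    """
    n = len(values)
    if n < n_points:
        raise ValueError("need at least n_points samples")
    out = []
    for i in range(n_points):
        # Integer bin edges that cover the input without gaps or overlap.
        start = i * n // n_points
        stop = (i + 1) * n // n_points
        chunk = values[start:stop]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical high-resolution spectrum: a linear ramp of 1000 samples.
spectrum = [float(i) for i in range(1000)]
coarse = downsample(spectrum)
print(len(coarse), coarse[0], coarse[-1])
```

Bin averaging suppresses noise at the cost of smearing narrow spectral lines, so the choice of method may matter for classes distinguished by narrow features.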