Talking with the Latents -- how to convert your LLM into an astronomer

Ilay Kamai, Marc Huertas-Company, Mike J. Smith, Hagai B. Perets

TL;DR

Kamai et al. introduce Latent Interpreter (LI), a framework that fuses astrophysical latent features from foundation models with a pre-trained LLM through an Adapter Network and LoRA in a teacher–student setup. The approach enables the LI to reason over real scientific data and be steered by physically meaningful latent directions, with larger LLMs delivering stronger performance and interpretability. Experimental results demonstrate improved parameter inference, emergent multimodal reasoning, and controllable latent-space navigation, suggesting a scalable path toward scientifically capable LLMs. While demonstrated in astrophysics, the method is presented as domain-agnostic, offering a general interface between scientific latent spaces and natural language guidance.

Abstract

Recent advances in Large Language Models (LLMs) offer unique opportunities for scientific tasks, yet their ability to reason over complex numerical data remains largely unexplored. We propose a simple mechanism to introduce domain-specific physical knowledge into LLMs by fusing pre-trained latent physical features with a pre-trained language model. Our method employs a teacher-student knowledge distillation framework where a large LLM (teacher) generates synthetic question-answer supervision to transfer physical reasoning to a smaller LLM (student). The student is conditioned on latent physical features and trained via a lightweight adapter and Low-Rank Adaptation (LoRA). We demonstrate that this approach, applied to models with 1B, 8B, and 32B parameters, enables effective reasoning over real scientific data. Our models substantially outperform strong baselines, such as Gemini 3 Pro, across multiple downstream tasks without task-specific fine-tuning. We show that the model combines latent information with general physical understanding to predict complex properties and can be "steered" by identifying physically meaningful directions in the latent space. This allows for explicit physical manipulation and natural language interpretation of latent structures. While our experiments focus on astrophysics, the framework is domain-agnostic and applicable to various scientific fields. Our main contribution is a general framework for using LLMs as interpretable interfaces to scientific latent spaces, enabling a single model to perform diverse tasks through natural language guidance. This work marks a step toward developing scientifically capable and useful LLMs.
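
The abstract describes conditioning the student LLM on latent physical features through a lightweight adapter and LoRA. The sketch below illustrates one way such a setup could look, using NumPy only. All sizes, the two-layer MLP adapter, the "soft token" prepending scheme, and the function names `adapter` and `lora_linear` are assumptions made for illustration; the source does not specify the authors' architecture at this level of detail.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration (not from the paper).
D_LATENT = 16     # dimension of the physical latent vector
D_HIDDEN = 64     # adapter hidden width
D_MODEL = 32      # LLM hidden size
N_SOFT = 4        # number of soft tokens produced by the adapter
RANK = 2          # LoRA rank
LORA_ALPHA = 4.0  # LoRA scaling factor


def adapter(z, w1, w2):
    """Assumed adapter: a two-layer MLP mapping one latent vector to N_SOFT soft tokens."""
    hidden = np.tanh(z @ w1)
    return (hidden @ w2).reshape(N_SOFT, D_MODEL)


def lora_linear(x, w_frozen, a, b, alpha=LORA_ALPHA, rank=RANK):
    """Frozen weight plus a trainable low-rank update: x W + (alpha / r) (x A) B."""
    return x @ w_frozen + (alpha / rank) * (x @ a) @ b


# Adapter weights (trainable) and one frozen LLM projection matrix.
w1 = rng.normal(size=(D_LATENT, D_HIDDEN)) * 0.1
w2 = rng.normal(size=(D_HIDDEN, N_SOFT * D_MODEL)) * 0.1
w_frozen = rng.normal(size=(D_MODEL, D_MODEL)) * 0.1

# LoRA factors: one factor starts at zero, so the update is initially a no-op.
lora_a = rng.normal(size=(D_MODEL, RANK)) * 0.01
lora_b = np.zeros((RANK, D_MODEL))

z = rng.normal(size=D_LATENT)             # stand-in for latents from a pre-trained physical model
text_emb = rng.normal(size=(5, D_MODEL))  # stand-in for embeddings of 5 prompt tokens

soft_tokens = adapter(z, w1, w2)                            # shape (N_SOFT, D_MODEL)
sequence = np.concatenate([soft_tokens, text_emb], axis=0)  # latents first, then text
out = lora_linear(sequence, w_frozen, lora_a, lora_b)       # one LoRA-augmented projection
```

In this sketch only `w1`, `w2`, `lora_a`, and `lora_b` would be trained, while `w_frozen` stays fixed, which is what keeps the conditioning mechanism lightweight relative to the size of the LLM.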

Paper Structure

This paper contains 14 sections, 13 figures, 5 tables.

Figures (13)

  • Figure 1: An example of our question-answer dataset. The follow-up questions are added randomly in $70\%$ of the cases.
  • Figure 2: High-level diagram of the Latent Interpreter model. Blue colors represent the spectra model. Yellow colors represent the LLM.
  • Figure 3: Example of the steering effect for two different concepts: Dwarf to Giants (left) and Young to Old (right). In both panels, the black star represents the true parameters of the star under consideration. The colored circles represent the parameters generated with different $\alpha$ values. The colored lines represent theoretical isochrones (curves of constant age). The dashed black line represents the separation between dwarfs and giants from Ciardi et al. (2011). The gray background points represent the true parameters of the entire sample set.
  • Figure 4: Answers generated by the model during the steering experiment, for the example in Figure 3, left panel, for $\alpha=0,-2,2$.
  • Figure 5: Distributions of latent feature values for SC (left) and SViT (right) models.
  • ...and 8 more figures
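
The steering experiment in Figures 3 and 4 shifts a star's latent representation along a concept direction by an amount $\alpha$. A minimal sketch of that idea follows, assuming the update $z' = z + \alpha d$ with a unit direction $d$ taken as the difference between the mean latents of two labeled groups. The difference-of-means construction, the synthetic data, and the names `concept_direction` and `steer` are illustrative assumptions, not the authors' stated procedure.

```python
import numpy as np


def concept_direction(latents_a, latents_b):
    """Unit vector pointing from the mean of group A to the mean of group B."""
    diff = latents_b.mean(axis=0) - latents_a.mean(axis=0)
    return diff / np.linalg.norm(diff)


def steer(z, direction, alpha):
    """Shift a latent vector by alpha along a concept direction."""
    return z + alpha * direction


rng = np.random.default_rng(0)

# Synthetic stand-ins for the latents of two stellar populations (hypothetical data).
dwarfs = rng.normal(size=(100, 8))
giants = rng.normal(size=(100, 8))
giants[:, 0] += 3.0  # make the two groups differ mainly along one axis

d = concept_direction(dwarfs, giants)
z = dwarfs[0]

# Same alpha values as in the Figure 4 caption: 0 leaves z unchanged,
# positive values move toward the "giant" side, negative values away from it.
steered = {alpha: steer(z, d, alpha) for alpha in (-2, 0, 2)}
```

Each steered vector would then be passed to the model in place of the original latent, so the generated parameters and answers can be compared across $\alpha$ values.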