Alien Science: Sampling Coherent but Cognitively Unavailable Research Directions from Idea Atoms

Alejandro H. Artiles, Martin Weiss, Levin Brinkmann, Anirudh Goyal, Nasim Rahaman

TL;DR

The paper formalizes cognitive availability, the likelihood that a research direction would be naturally proposed by a typical researcher given what they have worked on, and uses it to sample research directions that are more diverse than LLM baselines while maintaining coherence.

Abstract

Large language models are adept at synthesizing and recombining familiar material, yet they often fail at a specific kind of creativity that matters most in research: producing ideas that are both coherent and non-obvious to the current community. We formalize this gap through cognitive availability, the likelihood that a research direction would be naturally proposed by a typical researcher given what they have worked on. We introduce a pipeline that (i) decomposes papers into granular conceptual units, (ii) clusters recurring units into a shared vocabulary of idea atoms, and (iii) learns two complementary models: a coherence model that scores whether a set of atoms constitutes a viable direction, and an availability model that scores how likely that direction is to be generated by researchers drawn from the community. We then sample "alien" directions that score high on coherence but low on availability. On a corpus of $\sim$7,500 recent LLM papers from NeurIPS, ICLR and ICML, we validate that (a) conceptual units preserve paper content under reconstruction, (b) idea atoms generalize across papers rather than memorizing paper-specific phrasing, and (c) the Alien sampler produces research directions that are more diverse than LLM baselines while maintaining coherence.
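
As a rough illustration of the sampling idea in the abstract (high coherence, low availability), the sketch below ranks candidate atom sets by a combined score. The abstract does not specify how the two model outputs are combined, so the subtraction with a weight `lam`, the function names `alien_score` and `rank_alien`, and the toy score tables are all assumptions made for illustration, not the authors' method.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

AtomSet = FrozenSet[str]
Scorer = Callable[[AtomSet], float]


def alien_score(atoms: AtomSet, coherence: Scorer, availability: Scorer,
                lam: float = 1.0) -> float:
    """Hypothetical objective: reward coherence, penalize availability."""
    return coherence(atoms) - lam * availability(atoms)


def rank_alien(candidates: Iterable[AtomSet], coherence: Scorer,
               availability: Scorer, lam: float = 1.0,
               top_k: int = 3) -> List[Tuple[float, AtomSet]]:
    """Return the top_k candidates, highest hypothetical alien score first."""
    scored = [(alien_score(c, coherence, availability, lam), c)
              for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy stand-ins for the learned models (made-up numbers, not paper results).
AB, AC, BC = frozenset({"a", "b"}), frozenset({"a", "c"}), frozenset({"b", "c"})
toy_coherence = {AB: 0.9, AC: 0.8, BC: 0.2}
toy_availability = {AB: 0.9, AC: 0.1, BC: 0.1}

ranked = rank_alien([AB, AC, BC], toy_coherence.__getitem__,
                    toy_availability.__getitem__)
```

In this toy setup the set `{"a", "c"}` ranks first because it is nearly as coherent as `{"a", "b"}` but far less available. A real system would replace the lookup tables with the learned coherence and availability models.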

Paper Structure

This paper contains 52 sections, 1 equation, 7 figures, 2 tables, 1 algorithm.

Figures (7)

  • Figure 1: Overview of the Alien Science Sampling pipeline. Papers are distilled into conceptual units, which are clustered into a shared vocabulary of idea atoms. A coherence model learns which atom combinations form viable research directions, while an availability model estimates which combinations typical researchers would propose. Alien directions are sampled by maximizing coherence while minimizing availability.
  • Figure 2: Distribution of reconstruction ratings across conditions. Conceptual units achieve near-perfect reconstruction; atom-only representations lose fidelity as the number of atoms decreases; combining atoms with conceptual units (noisy atoms) restores reconstruction quality. For training, we use the clustering with 2,457 atoms (without noisy atoms), which balances reconstruction fidelity and transferability.
  • Figure 3: Relationship between the number of atoms per paper and reconstruction quality. Papers with more atoms generally achieve higher reconstruction quality, as the LLM decoder needs to infer less missing information. When noisy atoms are included, reconstruction quality improves as the number of noisy atoms increases (i.e., the noisy atoms / clustered atoms ratio increases).
  • Figure 4: Stability of reconstruction (cosine similarity across multiple generations) versus reconstruction quality. High stability indicates the decoder consistently produces the same idea from a given atom combination.
  • Figure 5: Visual comparison of diversity across methods. LLMs show severe concentration on a small subset of atoms, while the Alien sampler achieves broad coverage comparable to random sampling.
  • ...and 2 more figures
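
The stability measure named in the Figure 4 caption (cosine similarity across multiple generations) could be computed along the following lines. This is a minimal sketch under stated assumptions: each generation is already embedded as a vector, and stability is taken to be the mean pairwise cosine similarity. The aggregation and the function names `cosine` and `stability` are assumptions; the caption does not give these details.

```python
import math
from itertools import combinations
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def stability(embeddings: List[Sequence[float]]) -> float:
    """Mean pairwise cosine similarity over embedded generations (assumed aggregation)."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        raise ValueError("need at least two generations to measure stability")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Identical generations are maximally stable; orthogonal ones are not.
same = stability([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
orthogonal = stability([[1.0, 0.0], [0.0, 1.0]])
```

Under this definition, a value near 1 means the decoder keeps producing nearly the same idea from a given atom combination, which matches the caption's reading of high stability.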