Artificial intelligence for science: The easy and hard problems

Ruairidh M. Battleday; Samuel J. Gershman

TL;DR

Studying the cognitive science of scientists can reveal how humans solve the hard problem of science, which is coming up with the problem itself. The results can then be used to design new computational agents that automatically infer and update their scientific paradigms.

Abstract

A suite of impressive scientific discoveries has been driven by recent advances in artificial intelligence. Almost all of these result from training flexible algorithms to solve difficult optimization problems specified in advance by teams of domain scientists and engineers with access to large amounts of data. Although extremely useful, this kind of problem solving corresponds to only one part of science: the "easy problem." The other part of scientific research is coming up with the problem itself: the "hard problem." Solving the hard problem is beyond the capacities of current algorithms for scientific discovery because it requires continual conceptual revision based on poorly defined constraints. We can make progress on understanding how humans solve the hard problem by studying the cognitive science of scientists, and then use the results to design new computational agents that automatically infer and update their scientific paradigms.

Paper Structure (16 sections, 3 equations, 4 figures)

This paper contains 16 sections, 3 equations, 4 figures.

The easy problem
The hard problem
The problem with optimization
Solving the hard problem
Case Study 1: The discovery of oxygen
Case Study 2: The electromagnetic field
Case Study 3: Protein folding
Understanding the hard problem
Towards scalable AI scientists that solve the hard problem
Acknowledgments

Figures (4)

Figure 1: The discovery of oxygen by STAHLp. The input is two beliefs. The first, inherited from Georg Ernst Stahl, states that mercury (M) is composed of calx-of-mercury (CM) and phlogiston (Ph). The second, reflecting an empirical observation by Joseph Priestley, states that calx-of-mercury is composed of mercury and a colorless gas (O). Through forward chaining, the system reaches a circularity present in the initial beliefs: calx-of-mercury can be decomposed into itself, phlogiston, and oxygen. The belief revision rules are then triggered, leading first to a set of effect hypotheses and then to cause hypotheses, which give the revisions needed to overcome the circularity. Reproduced from rose1986chemical.
Figure 2: The steps that AI Feynman goes through when applied to solve a scientific problem, given in the form of a mystery table. Reproduced from udrescu20tegmark.
Figure 3: AI Feynman recovers the correct expression at the top of the diagram by applying its problem-solving steps. Reproduced from udrescu20tegmark.
Figure 4: AlphaFold2 takes as input a 1D amino-acid sequence, any related sequences from genomic databases (the multiple sequence alignment, or MSA), and distance information from previously derived structures. The Evoformer module learns flexible and increasingly abstract re-representations of these inputs over many layers. The Structure module conditions on this information to output rigid 3D frames for each amino acid residue, before adjusting the frames to form a physically plausible structure during fine-tuning. Reproduced from jumper2021highly.
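
The forward-chaining step described in the Figure 1 caption can be illustrated with a small toy sketch. This is not STAHLp's actual implementation: the dictionary encoding of beliefs and the function names `expand_once` and `find_circularities` are assumptions made here for illustration, and the belief revision rules (effect and cause hypotheses) are not modeled. The sketch only shows how substituting one belief into the other exposes the circularity.

```python
# Toy encoding (an assumption, not STAHLp's representation) of the two
# input beliefs from Figure 1: each substance maps to its stated components.
beliefs = {
    "M": ["CM", "Ph"],  # Stahl: mercury = calx-of-mercury + phlogiston
    "CM": ["M", "O"],   # Priestley: calx-of-mercury = mercury + colorless gas
}


def expand_once(substance, beliefs):
    """Apply one chaining step: replace each component that has its own
    decomposition with that decomposition; leave other components as-is."""
    expanded = []
    for part in beliefs[substance]:
        expanded.extend(beliefs.get(part, [part]))
    return expanded


def find_circularities(beliefs):
    """Return each substance whose one-step expansion contains itself,
    together with that expansion."""
    circular = {}
    for substance in beliefs:
        expansion = expand_once(substance, beliefs)
        if substance in expansion:
            circular[substance] = expansion
    return circular


if __name__ == "__main__":
    for substance, expansion in find_circularities(beliefs).items():
        print(substance, "decomposes into", expansion)
```

Under this encoding, expanding CM yields CM, Ph, and O, which mirrors the circularity in the caption: calx-of-mercury decomposes into itself, phlogiston, and the colorless gas. A system like the one in Figure 1 would treat such a result as the trigger for belief revision.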
