AI for Scientific Discovery is a Social Problem
Georgia Channing, Avijit Ghosh
TL;DR
The paper argues that AI for scientific discovery is constrained more by social and institutional factors than by technical limits, identifying four interlinked barriers: community dysfunction, misaligned research priorities, data fragmentation, and infrastructure inequities. It critiques the AI-scientist myth, emphasizing mechanistic understanding and experimental grounding over predictive performance, and proposes a multi-pronged agenda (cross-disciplinary education, upstream benchmarking, standardized data practices, and community-owned infrastructure) to align incentives and broaden participation. Through case studies (e.g., CASP, The Materials Project, Schmidt Fellowship), it illustrates how sustained, community-governed efforts can magnify downstream impact beyond isolated advances. The work calls for reframing AI for science as a collective social project in which durable collaboration and equitable participation are prerequisites for technical progress and real scientific discovery.
Abstract
Artificial intelligence (AI) is increasingly applied to scientific research, but its benefits remain unevenly distributed across communities and disciplines. While technical challenges such as limited data, fragmented standards, and unequal access to computational resources exist, social and institutional factors are often the primary constraints. Narratives emphasizing autonomous "AI scientists," under-recognition of data and infrastructure work, misaligned incentives, and gaps between domain experts and machine learning researchers all limit the impact of AI on scientific discovery. This paper highlights four interconnected challenges: community coordination, misalignment of research priorities with upstream needs, data fragmentation, and infrastructure inequities. We argue that addressing these challenges requires not only technical innovation but also intentional efforts in community-building, cross-disciplinary education, shared benchmarks, and accessible infrastructure. We call for reframing AI for science as a collective social project, where sustainable collaboration and equitable participation are treated as prerequisites for technical progress.
