Position: Key Claims in LLM Research Have a Long Tail of Footnotes

Anna Rogers, Alexandra Sasha Luccioni

TL;DR

Problem: unclear definitions and overconfident claims about LLM capabilities. Approach: define LLMs with three criteria, critique five common claims using empirical and socio-technical perspectives, and propose steps to improve theory and practice. Findings: robustness is context-dependent; few-shot SOTA claims are not universally applicable; scaling is beneficial but not the sole driver; emergent properties are contested; reform is needed for evaluation, reproducibility, and diverse research. Impact: promotes precise terminology, open baselines, and methodological rigor to ensure responsible deployment and robust ML progress.

Abstract

Much of the recent discourse within the ML community has been centered around Large Language Models (LLMs), their functionality and potential -- yet not only do we not have a working definition of LLMs, but much of this discourse relies on claims and assumptions that are worth re-examining. We contribute a definition of LLMs, critically examine five common claims regarding their properties (including 'emergent properties'), and conclude with suggestions for future research directions and their framing.
