The False Dawn: Reevaluating Google's Reinforcement Learning for Chip Macro Placement

Igor L. Markov

TL;DR

A meta-analysis shows how two separate evaluations filled in the gaps left by Google's 2021 Nature paper and demonstrated that Google RL lags behind human designers, a well-known algorithm (Simulated Annealing), and generally available commercial software, while also being slower. In a 2023 open research contest, RL methods did not place in the top 5.

Abstract

Reinforcement learning (RL) for physical design of silicon chips in a Google 2021 Nature paper stirred controversy due to poorly documented claims that raised eyebrows and drew critical media coverage. The paper withheld critical methodology steps and most inputs needed to reproduce results. Our meta-analysis shows how two separate evaluations filled in the gaps and demonstrated that Google RL lags behind (i) human designers, (ii) a well-known algorithm (Simulated Annealing), and (iii) generally available commercial software, while being slower; and in a 2023 open research contest, RL methods did not place in the top 5. Cross-checked data indicate that the integrity of the Nature paper is substantially undermined owing to errors in conduct, analysis and reporting. Before publishing, Google rebuffed internal allegations of fraud, which still stand. We note policy implications and conclusions for chip design.

Paper Structure (21 sections, 1 equation, 1 figure, 4 tables)

Introduction
Background
Initial doubts
Unsubstantiated claims and insufficient reporting
A flawed optimization proxy
Use of handicapped techniques
Questionable baselines
Additional evidence
Methods
Results
Likely imitations
Did [1] improve SOTA?
Reproduction attempts
Open contest at MLCAD 2023
Rebuttals to critiques of [1]
...and 6 more sections

Figures (1)

Figure 1: Layouts from [SB] with macros in red and standard cells in green, locations produced by RL (left) and RePlAce (right) for the ibm10 benchmark from [ICCAD04]. Limiting macro locations to a coarse grid (left) leads to spreading of small macros (red squares on the grid) and elongates connecting wires: from 27.5 units (right) to 44.1 units (left) for ibm10 [SB]. Higher area utilization and many macros of different sizes make the ICCAD 2004 benchmarks [ICCAD04] challenging compared to benchmarks in [Nature] and [PeerReviews].
