My Precious Crash Data: Barriers and Opportunities in Encouraging Autonomous Driving Companies to Share Safety-Critical Data
Hauke Sandhaus, Angel Hsing-Chi Hwang, Wendy Ju, Qian Yang
TL;DR
This paper investigates why safety-critical autonomous-vehicle (AV) data are rarely shared and how incentives shape sharing practices. Through twelve insider interviews, it identifies two fundamental barriers: (1) the data inherently encode proprietary AV design knowledge, and (2) they are treated as private competitive assets rather than public goods. It reframes data sharing as an incentives problem and proposes strategies including clarifying public versus private knowledge, designing data tools that decouple data from embedded know-how, and leveraging academic intermediaries and policy frameworks. The work outlines practical pathways to promote data sharing that preserve competitive edges while advancing public safety in autonomous driving.
Abstract
Safety-critical data, such as crash and near-crash records, are crucial to improving autonomous vehicle (AV) design and development. Sharing such data across AV companies, academic researchers, regulators, and the public can help make all AVs safer. However, AV companies rarely share safety-critical data externally. This paper aims to pinpoint why AV companies are reluctant to share safety-critical data, with an eye on how these barriers can inform new approaches to promote sharing. We interviewed twelve AV company employees who work with such data day to day. Findings suggest two key, previously unknown barriers to data sharing: (1) Datasets inherently embed salient knowledge that is key to improving AV safety and are resource-intensive to curate. Therefore, data sharing, even within a company, is fraught with politics. (2) Interviewees believed AV safety knowledge is private knowledge that brings competitive edges to their companies, rather than public knowledge for social good. We discuss the implications of these findings for incentivizing and enabling safety-critical AV data sharing, specifically for new approaches to (1) debating and stratifying public and private AV safety knowledge; (2) innovating data tools and data sharing pipelines that enable easier sharing of public AV safety data and knowledge; and (3) offsetting the costs of curating safety-critical data and incentivizing data sharing.
