Navigating Uncertainties: Understanding How GenAI Developers Document Their Models on Open-Source Platforms

Ningjing Tang, Megan Li, Amy Winecoff, Michael Madaio, Hoda Heidari, Hong Shen

TL;DR

This study investigates how GenAI developers document models on open-source platforms through 13 semi-structured interviews analyzed with reflexive thematic analysis. It identifies three core dimensions of uncertainty (what content to document, how to document it, and who should be responsible), which arise from tensions among traditional OSS norms, market pressures, and responsible AI guidelines. The findings illuminate developers' motivations (reuse and maintenance, promotion, ethics) and pervasive ambiguities across content, process, and governance, with implications for policy, platform governance, and research toward better evaluation infrastructures and clearer roles. The work highlights the need for sociotechnical tooling, community norms, and clearer accountability to enable meaningful, actionable GenAI documentation in open-source ecosystems.

Abstract

Model documentation plays a crucial role in promoting transparency and responsible development of AI systems. With the rise of Generative AI (GenAI), open-source platforms have increasingly become hubs for hosting and distributing these models, prompting platforms like Hugging Face to develop dedicated model documentation guidelines that align with responsible AI principles. Despite these growing efforts, there remains a lack of understanding of how developers document their GenAI models on open-source platforms. Through interviews with 13 GenAI developers active on open-source platforms, we provide empirical insights into their documentation practices and challenges. Our analysis reveals that, despite existing resources, developers of GenAI models still face multiple layers of uncertainty in their model documentation: (1) uncertainty about what specific content should be included; (2) uncertainty about how to effectively report key components of their models; and (3) uncertainty about who should take responsibility for various aspects of model documentation. Based on our findings, we discuss the implications for policymakers, open-source platforms, and the research community to support meaningful, effective, and actionable model documentation in the GenAI era, including cultivating better community norms, building robust evaluation infrastructures, and clarifying roles and responsibilities.

Paper Structure

This paper contains 36 sections, 2 tables.