Automatic Generation of Model and Data Cards: A Step Towards Responsible AI

Jiarui Liu, Wenkai Li, Zhijing Jin, Mona Diab

TL;DR

The paper addresses the lack of standardized documentation for the rapidly growing number of open-source models and datasets. It introduces CardGen, an automatic card-generation pipeline, and CardBench, a large-scale dataset of model and data cards. CardGen uses a two-stage retrieval-and-generation approach to produce model and data cards from source papers and READMEs, with structured prompts and role-specific LLMs intended to improve completeness, objectivity, and faithfulness. CardBench, aggregated from over 4.8k model cards and 1.4k data cards, supports evaluation with automatic metrics (ROUGE, BERTScore, BARTScore, NLI), GPT-based faithfulness assessments, and human judgments. The results show that CardGen can outperform human-written cards on several quality facets, although human-written cards retain advantages in accuracy and reference quality, which underlines the need for human validation alongside automatic generation. The work supports accountable AI by offering scalable, consistent, and transparent documentation for models and datasets, while acknowledging limitations such as potential hallucinations and the need for further safeguards in future work.
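
As a rough illustration of the simplest automatic metric named above, the following is a minimal sketch of ROUGE-1 (clipped unigram overlap) between a generated card sentence and a reference sentence. It is not the paper's evaluation code: the function name `rouge1`, the lowercase whitespace tokenization, and the example sentences are all assumptions made for illustration, and standard ROUGE toolkits apply their own tokenization and optional stemming.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: clipped unigram overlap between two strings.

    Assumption: tokens are lowercase whitespace-separated words; no stemming
    or punctuation handling is applied.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((cand & ref).values())

    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example sentences, not taken from CardBench.
scores = rouge1(
    "the model is trained on wikipedia",
    "the model was trained on english wikipedia",
)
```

Here five of the six candidate tokens also appear in the seven-token reference, so precision is 5/6 and recall is 5/7. Overlap metrics of this kind reward shared wording only, which is one reason the paper pairs them with NLI-based, GPT-based, and human assessments of faithfulness.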

Abstract

In an era of model and data proliferation in machine learning and AI, marked especially by the rapid advancement of open-source technologies, there is a critical need for standardized, consistent documentation. Our work addresses the information incompleteness in current human-generated model and data cards. We propose an automated generation approach using Large Language Models (LLMs). Our key contributions include the establishment of CardBench, a comprehensive dataset aggregated from over 4.8k model cards and 1.4k data cards, and the development of the CardGen pipeline, which comprises a two-step retrieval process. Our approach exhibits enhanced completeness, objectivity, and faithfulness in generated model and data cards, a significant step in responsible AI documentation practices that ensures better accountability and traceability.

Paper Structure (52 sections, 12 figures, 18 tables)

Figures (12)

  • Figure 1: Common problems with manually generated model cards and data cards.
  • Figure 2: Overview of the CardGen pipeline to generate a full model card or a full data card.
  • Figure 3: The bert-base-uncased (Devlin et al., 2018) model card as an example of a current model card, with a unique disclaimer sentence indicating a modification by the HF team.
  • Figure 4: Prompts for calling GPT-3.5 to select direct paper links. We prepend one positive example and one negative example to the message list to improve its inference quality.
  • Figure 5: The task taxonomy of models in the model cards dataset (left), and the task taxonomy of datasets in the dataset cards dataset (right), with the inner circle as the test set, and the outer circle as the whole set.
  • ...and 7 more figures