Linting Style and Substance in READMEs

Hima Mynampaty, Nathania Josephine, Katherine E. Isaacs, Andrew M. McNutt

TL;DR

The design of LintMe, a design probe for linting more complex documentation and other culturally mediated text-based documents, is found to be approachable, flexible, and well matched with the needs of this domain.

Abstract

READMEs shape first impressions of software projects, yet what constitutes a good README varies across audiences and contexts. Research software needs reproducibility details, while open-source libraries might prioritize quick-start guides. Through a design probe, LintMe, we explore how linting can be used to improve READMEs given these diverse contexts, addressing style and content issues while preserving authorial agency. Users create context-specific checks using a lightweight DSL that uses a novel combination of programmatic operations (e.g., for broken links) with LLM-based content evaluation (e.g., for detecting jargon), yielding checks that would be challenging for prior linters. Through a user study (N=11), comparison with naive LLM usage, and an extensibility case study, we find that our design is approachable, flexible, and well matched with the needs of this domain. This work opens the door for linting more complex documentation and other culturally mediated text-based documents.

Paper Structure

This paper contains 42 sections, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Prior tools have explored different approaches to linting markdown files in general and READMEs specifically. However, most tools have focused on either markdown style (e.g., correct markdown usage) or natural language style (e.g., avoiding hateful terms). Here we draw a sample of representative lints from across the documentation typology of [tang2023evaluating] (denoted in purple) for a variety of linters. While there are other linters and other lints, this collection offers a representative sample of the type of functionality available. Additional details about this coding are available in the supplementary materials.
  • Figure 2: The LintMe playground supports lint rule usage as well as development and customization.
  • Figure 3: Each LintMe lint is a configurable pipeline of composable operators, and each operator has its own typing. For instance, extract targets any mdast AST node (e.g., delete, the node for strikethrough-styled text), scoped to one of several regions, and returns a list of results.
  • Figure 4: Our playground includes a variety of different types of support for authoring or customizing lints, including inline documentation (E) and automatic rule generation (B).
  • Figure 5: LintMe can identify (i.e., blame [mcnutt2024mixing]) the specific line or text chunk that caused a specific error. If a specific item cannot be identified, then the warning is placed at the top of the file. Like lint errors in code, this error can be ignored, or an attempt can be made to automatically fix it.
  • ...and 2 more figures