ReadMe.LLM: A Framework to Help LLMs Understand Your Library
Sandya Wijaya, Jacob Bolano, Alejandro Gomez Soteres, Shriyanshu Kode, Yue Huang, Anant Sahai
TL;DR
The paper addresses the problem that LLMs struggle to use niche software libraries correctly because the available documentation is underrepresented and written for humans. It introduces ReadMe.LLM, an LLM-oriented, XML-structured documentation format that library maintainers attach to their codebases to guide code generation. Across five LLMs and two libraries, adding ReadMe.LLM as context substantially improved code-generation accuracy, reaching near-perfect performance and up to 100% in several cases, with strong generalization in held-out tests. The findings suggest that ReadMe.LLM can democratize access to smaller libraries by standardizing machine-friendly context, and that it can enable smoother integration with AI agents and tools. The paper also outlines future work on API/tool-use extensions and editor integrations.
Abstract
Large Language Models (LLMs) often struggle with code generation tasks involving niche software libraries. Existing code generation techniques with only human-oriented documentation can fail, even when the LLM has access to web search and the library is documented online. To address this challenge, we propose ReadMe.LLM, LLM-oriented documentation for software libraries. By attaching the contents of ReadMe.LLM to a query, performance consistently improves to near-perfect accuracy, with one case study demonstrating up to 100% success across all tested models. We propose a software development lifecycle where LLM-specific documentation is maintained alongside traditional software updates. In this study, we present two practical applications of the ReadMe.LLM idea with diverse software libraries, highlighting that our proposed approach could generalize across programming domains.
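To make "attaching the contents of ReadMe.LLM to a query" concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the format or tooling defined in the paper: the library `tinygeo`, the XML tag names, and the helper `build_prompt` are all hypothetical.

```python
# Hypothetical LLM-oriented documentation for an imaginary library `tinygeo`.
# The tag names and contents are illustrative assumptions only; they are not
# the ReadMe.LLM schema from the paper.
README_LLM = """\
<library name="tinygeo">
  <rule>Always call tinygeo.init() before any other function.</rule>
  <function signature="tinygeo.distance(a, b) -> float">
    Euclidean distance between two (x, y) tuples.
  </function>
  <example>tinygeo.init(); d = tinygeo.distance((0, 0), (3, 4))</example>
</library>
"""


def build_prompt(readme_llm: str, task: str) -> str:
    """Prepend the documentation to the user's task so the model sees it first."""
    return (
        "Use the library documentation below when writing code.\n\n"
        f"{readme_llm.strip()}\n\n"
        f"Task: {task.strip()}\n"
    )


# The resulting string would be sent to whichever LLM is being used.
prompt = build_prompt(README_LLM, "Compute the distance between (1, 2) and (4, 6).")
print(prompt)
```

The sketch only builds the prompt text; how that text is sent to a model, and what structure a real ReadMe.LLM file should follow, are left to the paper and to the library maintainer.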
