Literate Tracing

Matthew Sotoudeh

TL;DR

This paper addresses the challenge of understanding and modifying large, complex software systems by introducing literate tracing, a documentation paradigm built around annotated, concrete execution traces. It presents TReX, a tool that combines HTML and LaTeX frontends, GDB-backed program tracing, and Python visualizations to produce interactive traces that are faithful to real code execution. The author demonstrates multi-level visualizations on components of major systems (the Linux kernel, Git, and GCC) and describes a practical methodology for constructing traces, including modes, workflows, and lessons learned. By making runtime behavior tangible through shareable, stateful traces that link high-level design to low-level implementation, literate tracing with TReX aims to support education, software archaeology, and the maintenance of large codebases.

Abstract

As computer systems grow ever larger and more complex, a crucial task in software development is for one person (the system expert) to communicate to another (the system novice) how a certain program works. This paper reports on the author's experiences with a paradigm for program documentation that we call literate tracing. A literate trace explains a software system using annotated, concrete execution traces of the system. Literate traces complement both in-code comments (which often lack global context) and out-of-band design docs (which often lack a concrete connection to the code). We also describe TReX, our tool for making literate traces that are interactive, visual, and guaranteed by construction to be faithful to the program semantics. We have used TReX to write literate traces explaining components of large systems software including the Linux kernel, Git source control system, and GCC compiler.

Paper Structure

This paper contains 34 sections and 4 figures.

Figures (4)

  • Figure 1: Literate tracing with the TReX LaTeX package
  • Figure 2: Literate tracing with the TReX HTML preprocessor
  • Figure 3: This singleStepper command records a frame every time the program reaches either (1) any line in the rbtree.c file, or (2) lines 63-87 of the rbtree_augmented.h file. For each of those steps, it will output a single frame with the visualization computed by the custom printProcTree command.
  • Figure 4: Custom commands can be defined in Python and imported using the trexInitialize command. Here we show how the built-in GDBEval module can be implemented.