Exports
Scientific papers are typically distributed as PDFs, which are convenient for human readers but hard for machines, AI tools, and automated workflows to process.
ScienceStack transforms LaTeX source into three structured export formats that preserve the full semantic content of research papers:
- Markdown (`.md`): Human-readable, works with Obsidian/Notion/VSCode, preserves all numbering
- JSON (`.json`): Machine-native, optimized for LLMs and AI pipelines
- LaTeX (`.tex`): Raw LaTeX with all macros expanded
All formats preserve equations, section numbers, cross-references, and document structure, which PDF extraction and generic converters tend to lose.
The PDF Problem
PDFs flatten rich document structure into visual layouts, stripping away semantic meaning:
- Loss of structure: Sections, figures, theorems, and references are mashed into a page dump
- Broken math: Equations are often extracted incorrectly (superscripts and fractions collapse)
- No semantic cues: Citations appear as `[12]` instead of links to the actual references
- Bad for AI: LLMs waste tokens on noise (line breaks, formatting artifacts)
How to Export
1. Navigate to any paper on ScienceStack
2. Click the Download button in the top-right navigation bar
3. Select your preferred format from the dropdown
4. Configure options (annotations, assets) and download
Markdown Export
Our Markdown export is purpose-built for research papers and significantly more robust than generic LaTeX→Markdown converters.
Key Features
- Complete numbering preservation: Sections, equations, figures, tables, and theorems all keep their original numbers
- Linkable cross-references: All `\ref{...}` commands become live Markdown links
- Complete asset package (Pro): Download with all figures and diagrams in optimized formats
- LLM-friendly annotations: Your notes are embedded as structured JSON in HTML comments
- Works everywhere: Compatible with Obsidian, Notion, VSCode, GitHub
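Because annotations travel as JSON inside HTML comments, they can be recovered from an exported file with a few lines of code. The following is a minimal sketch, not the official format: it assumes each annotation is a single HTML comment whose body is one JSON object, and the `note` and `target` keys in the sample are invented for illustration.

```python
import json
import re

# Matches any HTML comment; the non-greedy body may span several lines.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def extract_annotations(markdown_text):
    """Return every HTML comment whose body parses as a JSON object."""
    annotations = []
    for match in COMMENT_RE.finditer(markdown_text):
        body = match.group(1).strip()
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            continue  # an ordinary comment, not an annotation
        if isinstance(payload, dict):
            annotations.append(payload)
    return annotations


# Hypothetical exported snippet; the "note" and "target" keys are assumptions.
sample = """## 2 Method

We minimize the loss in Eq. (3).
<!-- {"note": "check the sign here", "target": "eq-3"} -->
<!-- plain comment, ignored -->
"""

notes = extract_annotations(sample)
print(notes)
```

Comments that do not parse as a JSON object are skipped, so ordinary HTML comments in the file do not interfere.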
JSON Export
Our JSON format is machine-native and optimized for AI applications, LLM ingestion, and programmatic analysis.
Why JSON Over PDFs for LLMs?
| Problem | PDF | Our JSON |
|---|---|---|
| Math extraction | Corrupted | LaTeX preserved |
| Structure | Flattened | Full semantic tree |
| Numbering | OCR errors | All elements numbered |
Key Properties
- Macros expanded: All `\newcommand` definitions resolved
- Stable IDs: Every block has a unique identifier
- Semantic types: Explicit tags for abstracts, proofs, definitions, etc.
- Resolved references: `\ref{thm:main}` links to the actual theorem block
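Stable IDs and resolved references make it straightforward to follow a cross-reference programmatically. The sketch below illustrates the idea on a made-up document: the field names (`id`, `type`, `number`, `children`, `refs`) are assumptions chosen for the example and may not match the actual export schema.

```python
import json


def index_blocks(block, index=None):
    """Walk the tree depth-first and map each block ID to its block."""
    if index is None:
        index = {}
    index[block["id"]] = block
    for child in block.get("children", []):
        index_blocks(child, index)
    return index


def resolve_refs(block, index):
    """Return (type, number) for every block that this block refers to."""
    return [
        (index[target]["type"], index[target].get("number"))
        for target in block.get("refs", [])
    ]


# Hypothetical document; every field name here is an assumption.
doc = json.loads("""
{
  "id": "root", "type": "document",
  "children": [
    {"id": "thm-1", "type": "theorem", "number": "1",
     "text": "Every bounded sequence has a convergent subsequence."},
    {"id": "p-7", "type": "paragraph",
     "text": "By Theorem 1, the iterates converge.", "refs": ["thm-1"]}
  ]
}
""")

blocks = index_blocks(doc)
print(resolve_refs(blocks["p-7"], blocks))
```

Building the ID index once keeps each lookup constant-time, which matters when a pipeline resolves many references across a long paper.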
LaTeX Export
Download the raw LaTeX source with all macros expanded and content in the correct order.
What You Get
- Macro expansion: All `\newcommand`, `\def`, and custom commands resolved
- Complete content: All `\input` and `\include` files merged in order
- Clean formatting: Unnecessary whitespace and comments removed
- Bibliography included: References appended as BibTeX entries
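One way to sanity-check a downloaded `.tex` file is to scan it for commands that expansion and merging would normally eliminate. This is a rough sketch under the assumption that a fully expanded export contains no remaining macro definitions or file includes; the actual export may legitimately keep some of them, so treat any hits as prompts for inspection rather than errors. The function name `find_leftovers` is hypothetical.

```python
import re

# Commands that a fully expanded, merged file is assumed not to contain.
# The trailing \b keeps \includegraphics and \definecolor from matching.
LEFTOVER_RE = re.compile(r"\\(newcommand|renewcommand|def|input|include)\b")


def find_leftovers(tex_source):
    """Return (line_number, command) for each leftover command outside comments."""
    hits = []
    for lineno, line in enumerate(tex_source.splitlines(), start=1):
        code = re.sub(r"(?<!\\)%.*", "", line)  # drop the comment part of the line
        for match in LEFTOVER_RE.finditer(code):
            hits.append((lineno, match.group(1)))
    return hits


# Hypothetical file contents used only to exercise the check.
sample = r"""\section{Intro}
$x \in \mathbb{R}$
\input{appendix}
% \newcommand{\R}{\mathbb{R}}
\includegraphics{fig1}
"""

print(find_leftovers(sample))
```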