
PDF to JSON Converter

Extract academic PDFs to structured JSON with document hierarchy and metadata.

1 credit per page — Free tier includes 30 credits/month.

Free account required. See pricing for high-volume use.

Built for your workflow

PDF to JSON conversion that actually works for academic papers.

Data Pipelines

Process PDF paper archives into structured data for analysis or ML training

RAG Systems

Build retrieval systems over PDF papers with queryable structure

Paper Analytics

Extract and analyze equations, citations, or sections across many papers

Format Conversion

Use JSON as an intermediate format to convert PDFs to any output you need
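As a rough illustration of the format-conversion use case, the sketch below renders a document tree as Markdown. It assumes the node shapes shown in this page's sample output: plain strings for paragraphs, and objects whose `type` is `section` or `equation`. The function name `to_markdown` is hypothetical and not part of any ScienceStack tooling, and other element types (figures, tables) are skipped because their fields are not shown here.

```python
from typing import Any


def to_markdown(nodes: list[Any], level: int = 1) -> str:
    """Render a list of tree nodes as Markdown (assumed schema, see lead-in)."""
    parts: list[str] = []
    for node in nodes:
        if isinstance(node, str):
            # Plain strings are treated as paragraphs.
            parts.append(node)
        elif isinstance(node, dict) and node.get("type") == "section":
            # A section becomes a heading, followed by its rendered children.
            parts.append("#" * level + " " + node.get("title", ""))
            parts.append(to_markdown(node.get("content", []), level + 1))
        elif isinstance(node, dict) and node.get("type") == "equation":
            # Equations are assumed to carry LaTeX in "content".
            parts.append("$$" + node["content"] + "$$")
        # Any other node type is skipped in this sketch.
    return "\n\n".join(p for p in parts if p)


tree = [
    {
        "type": "section",
        "title": "Introduction",
        "content": ["Text.", {"type": "equation", "content": "a = b"}],
    }
]
print(to_markdown(tree))
```

Nested sections would come out as deeper headings, since `level` increases by one on each recursive call.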

AI-powered extraction

State-of-the-art vision models extract structure, equations, and content from any PDF layout.

AI-Powered Extraction

Uses vision models to accurately extract text, equations, and structure from PDFs

Layout Understanding

Recognizes multi-column layouts, figures, tables, and sidebars

Equation Recognition

Converts mathematical equations back to LaTeX notation

Figure Extraction

Extracts figures and diagrams with captions preserved

Structured Output

Full document tree with typed elements: section, equation, figure, table, and more.

Metadata Extraction

Extracts title, authors, abstract, and publication info when available

See what you get

Real output from converting the “Deep Residual Learning for Image Recognition” paper.

output.json
{
  "by": "sciencestack.ai",
  "title": "Deep Residual Learning for Image Recognition",
  "abstract": "Deeper neural networks are more difficult to train...",
  "authors": ["Kaiming He", "Xiangyu Zhang", "Shaoqing Ren", "Jian Sun"],
  "document": [
    {
      "type": "section",
      "title": "Introduction",
      "content": [
        "Deep networks naturally integrate low/mid/high-level features...",
        {
          "type": "equation",
          "content": "\\mathcal{F}(x) := \\mathcal{H}(x) - x"
        }
      ]
    }
  ]
}

Equations, cross-references, and structure — all preserved.
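To show how this structure can be queried, here is a minimal sketch that walks the tree and collects every equation as LaTeX. It assumes only what the sample above shows: `document` is a list of nodes, a node's `content` is either a string or a list of children, and equation nodes have `"type": "equation"`. The helper name `iter_nodes` is hypothetical, and the embedded `SAMPLE` is a trimmed copy of the output above.

```python
import json
from typing import Any, Iterator

# Trimmed copy of the sample output; a raw string keeps the JSON escapes intact.
SAMPLE = r'''
{
  "title": "Deep Residual Learning for Image Recognition",
  "document": [
    {
      "type": "section",
      "title": "Introduction",
      "content": [
        "Deep networks naturally integrate low/mid/high-level features...",
        {"type": "equation", "content": "\\mathcal{F}(x) := \\mathcal{H}(x) - x"}
      ]
    }
  ]
}
'''


def iter_nodes(node: Any) -> Iterator[dict]:
    """Yield every object node in the tree, depth first."""
    if isinstance(node, dict):
        yield node
        children = node.get("content")
        # Only descend when "content" is a list; equation content is a string.
        if isinstance(children, list):
            yield from iter_nodes(children)
    elif isinstance(node, list):
        for child in node:
            yield from iter_nodes(child)


doc = json.loads(SAMPLE)
equations = [
    n["content"] for n in iter_nodes(doc["document"]) if n.get("type") == "equation"
]
print(equations)
```

The same walk can filter on other `type` values, such as `section`, to pull headings or build an outline.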

How it works

1. Upload

Upload any academic PDF. We extract structure, equations, and metadata.

2. Process

AI extracts text, equations, and structure from your PDF.

3. Download

Structured JSON with document hierarchy, equations as LaTeX, and rich metadata.

Simple pricing

1 credit per page

  • AI-powered extraction
  • Equation preservation
  • Cross-reference resolution
  • Bibliography included
  • Structured document tree

Free tier includes 30 credits/month. View plans for more credits.

JSON export requires a Plus subscription.
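The per-page pricing can be estimated with simple arithmetic. The sketch below uses only the two figures stated on this page (1 credit per page, 30 free credits per month). The function name and the example page counts are made up for illustration, and it does not model paid plans.

```python
CREDITS_PER_PAGE = 1          # stated on this page
FREE_CREDITS_PER_MONTH = 30   # stated on this page


def credits_needed(page_counts: list[int]) -> int:
    """Total credits for a batch of PDFs, given each PDF's page count."""
    return sum(page_counts) * CREDITS_PER_PAGE


# Hypothetical batch: three papers of 12, 9, and 15 pages.
batch = [12, 9, 15]
needed = credits_needed(batch)
over_free_tier = max(0, needed - FREE_CREDITS_PER_MONTH)
print(needed, over_free_tier)
```

For this batch the total is 36 credits, which is 6 more than the monthly free allowance.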

Ready to convert?

Upload your PDF and get structured JSON in seconds.
