API Overview

The ScienceStack API provides programmatic access to 300k+ parsed arXiv papers with structured data, semantic search, and citation graphs. Our index includes all arXiv papers from 2025 onwards, plus highly-cited papers from earlier years.

Base URL: https://sciencestack.ai/api/v1

Why ScienceStack?

Feature	ScienceStack	PDF + OCR
Parsed from LaTeX source	✓	✗
Pre-parsed, low-latency access	✓	✗
Structured AST (stable node IDs)	✓	✗
Clean LaTeX equations	✓	✗
AI summaries (paper + section)	✓	✗
Direct figure URLs + dimensions	✓	✗
Tables with structured data	✓	✗
Export to Markdown / LaTeX / text	✓	✗

Quick Start

# 1. Get your API key from https://sciencestack.ai/settings/api

# 2. Search for papers (free, no quota)
curl "https://sciencestack.ai/api/v1/search?q=attention+mechanisms" \
  -H "x-api-key: sk_live_your_key_here"

# 3. Browse recent papers in a field (free, no quota)
curl "https://sciencestack.ai/api/v1/papers?field=machine-learning&sort=recent" \
  -H "x-api-key: sk_live_your_key_here"

# 4. Preview a paper (free, no quota)
curl "https://sciencestack.ai/api/v1/papers/1706.03762/overview" \
  -H "x-api-key: sk_live_your_key_here"

# 5. Get full content as markdown (uses quota)
curl "https://sciencestack.ai/api/v1/papers/1706.03762/content?format=markdown" \
  -H "x-api-key: sk_live_your_key_here"

# 6. Get all equations with LaTeX (uses quota)
curl "https://sciencestack.ai/api/v1/papers/1706.03762/nodes?types=equation&format=markdown" \
  -H "x-api-key: sk_live_your_key_here"

# 7. Get multiple specific nodes in one request (uses quota)
curl "https://sciencestack.ai/api/v1/papers/1706.03762/nodes?nodeIds=eq:1,eq:2,fig:1&format=markdown" \
  -H "x-api-key: sk_live_your_key_here"

Tip for agents: Start with /overview to get TOC, summaries, and lists of all figures/tables/equations in one call — then drill down to specific elements via /nodes?nodeIds=eq:1,fig:2.
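
The overview-then-drill-down flow from the tip above can be sketched in Python. This is a minimal illustration, not an official client: `build_request` is a hypothetical helper, and the requests are only constructed here, not sent. The paths, the `x-api-key` header, and the query parameters are taken from the Quick Start commands.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://sciencestack.ai/api/v1"
API_KEY = "sk_live_your_key_here"  # placeholder key from the Quick Start


def build_request(path, api_key, **params):
    """Return a GET Request for `path` with the x-api-key header set.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    # Keep ":" and "," unescaped so node IDs read the same as in the curl examples.
    query = urlencode(params, safe=":,")
    url = f"{BASE_URL}{path}" + (f"?{query}" if query else "")
    return Request(url, headers={"x-api-key": api_key})


# Step 1: one free call for the TOC, summaries, and element lists.
overview_req = build_request("/papers/1706.03762/overview", API_KEY)

# Step 2: drill down to the specific nodes you need (uses quota).
nodes_req = build_request(
    "/papers/1706.03762/nodes",
    API_KEY,
    nodeIds="eq:1,fig:2",
    format="markdown",
)

print(overview_req.full_url)
print(nodes_req.full_url)
```

To send either request, pass it to `urllib.request.urlopen` and decode the body as JSON; the response shape is documented in the API Reference and is not assumed here.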

Available Endpoints

Discovery (Free)

Endpoint	Description
`GET /search`	Search papers by keywords, author, or arXiv ID
`GET /papers`	Browse papers by field or arXiv category
`GET /papers/{id}/overview`	Get paper metadata, TOC, and element lists
`GET /papers/overview?ids=`	Batch overview for multiple papers (Pro+)
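
For the batch overview endpoint, a client usually has to split a long list of paper IDs into several requests. The sketch below assumes the `ids` parameter takes a comma-separated list (by analogy with `nodeIds` in the Quick Start; check the API Reference) and uses the 10-papers-per-request cap stated under Rate Limits. `batch_overview_urls` is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE_URL = "https://sciencestack.ai/api/v1"
MAX_BATCH = 10  # per-request cap stated under Rate Limits


def batch_overview_urls(paper_ids, batch_size=MAX_BATCH):
    """Return one batch-overview URL per group of at most `batch_size` IDs.

    Assumes `ids` is a comma-separated list; that separator is a guess.
    """
    urls = []
    for start in range(0, len(paper_ids), batch_size):
        chunk = paper_ids[start:start + batch_size]
        query = urlencode({"ids": ",".join(chunk)}, safe=",")
        urls.append(f"{BASE_URL}/papers/overview?{query}")
    return urls


# 23 made-up IDs are split into groups of 10, 10, and 3.
example_ids = [f"2501.{i:05d}" for i in range(23)]
for url in batch_overview_urls(example_ids):
    print(url)
```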

Content (Uses Quota)

Endpoint	Description
`GET /papers/{id}/content`	Full paper content (markdown, latex, or raw AST)
`GET /papers/{id}/nodes`	Fetch specific sections, equations, figures by ID or type
`GET /papers/{id}/references`	Bibliography with Semantic Scholar enrichment
`GET /papers/{id}/citations`	Papers that cite this paper

User

Endpoint	Description
`GET /me/papers`	List your uploaded papers
`GET /me/usage`	Check your API usage and limits

Rate Limits

API access is limited by the number of unique papers you access per month and by requests per minute:

Tier	Papers/month	Requests/min
Free	10	30
Pro	200	60
Team	1,000	120

/search, /papers (browse), and /papers/{id}/overview are free (no quota consumed)
The /papers/overview?ids= batch endpoint is available on Pro+ (up to 10 papers per request)
Once you access a paper, you can make unlimited requests to that paper for the rest of the month
Limits reset on the 1st of each month (UTC)
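
The quota rules above can be mirrored on the client to avoid surprise rejections. The following is a rough sketch under the stated rules only (unique papers per month, repeat access free, reset on the 1st in UTC); `PaperQuota` is a hypothetical class, and the server's own count from `GET /me/usage` remains the source of truth.

```python
from datetime import datetime, timezone


class PaperQuota:
    """Client-side estimate of the monthly unique-paper quota."""

    def __init__(self, papers_per_month):
        self.limit = papers_per_month
        self.month = None   # (year, month) of the current window, in UTC
        self.seen = set()   # paper IDs already counted this month

    def charge(self, paper_id, now=None):
        """Record a quota-consuming access; return False if it would exceed the limit."""
        if now is None:
            now = datetime.now(timezone.utc)
        window = (now.year, now.month)
        if window != self.month:    # limits reset on the 1st of each month (UTC)
            self.month, self.seen = window, set()
        if paper_id in self.seen:   # repeat access to a paper costs nothing extra
            return True
        if len(self.seen) >= self.limit:
            return False
        self.seen.add(paper_id)
        return True

    @property
    def remaining(self):
        return self.limit - len(self.seen)


quota = PaperQuota(papers_per_month=10)  # Free tier value from the table
if quota.charge("1706.03762"):
    print("OK to fetch content;", quota.remaining, "papers left this month")
```

Only content calls should go through `charge`; search, browse, and overview calls consume no quota. Pass a timezone-aware UTC `now` if you supply your own timestamp.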

Output Formats

Use the format query parameter on /content and /nodes:

markdown (default) — Clean markdown, ideal for LLM context
latex — Reconstructed LaTeX source
raw — JSON AST with full token structure (Pro+)
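
A small guard can catch an unsupported `format` value before a request is sent. This sketch assumes "Pro+" means the Pro and Team tiers from the Rate Limits table; `content_url` and its `tier` argument are hypothetical, and the server enforces the real rules.

```python
from urllib.parse import urlencode

BASE_URL = "https://sciencestack.ai/api/v1"
FORMATS = ("markdown", "latex", "raw")
RAW_TIERS = ("pro", "team")  # assumption: "Pro+" means Pro and Team


def content_url(paper_id, fmt="markdown", tier="free"):
    """Build a /content URL, rejecting format values the docs do not list."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format {fmt!r}; expected one of {FORMATS}")
    if fmt == "raw" and tier.lower() not in RAW_TIERS:
        raise ValueError("format=raw is listed as Pro+ only")
    query = urlencode({"format": fmt})
    return f"{BASE_URL}/papers/{paper_id}/content?{query}"


print(content_url("1706.03762"))
print(content_url("1706.03762", fmt="raw", tier="Pro"))
```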

Token AST (Raw Format)

When you request format=raw, you get the paper as a token tree — a hierarchical AST parsed directly from LaTeX:

{
  "type": "section",
  "numbering": "3",
  "title": [{"type": "text", "content": "Methods"}],
  "content": [
    {"type": "text", "content": "We propose..."},
    {"type": "equation", "content": "E = mc^2", "display": "block", "numbering": "1"}
  ]
}

Each token has a type field — see the API Reference for the full structure of each type.
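
Because the raw format is a tree, most consumers will want a recursive walk. The sketch below runs on the sample token shown above and collects every equation. It assumes child tokens live only in list-valued "title" and "content" fields, which is all the sample shows; other token types may nest under different keys, so check the API Reference. `iter_tokens` is a hypothetical helper.

```python
import json


def iter_tokens(token):
    """Yield `token` and every token nested in its "title" or "content" lists."""
    yield token
    for key in ("title", "content"):
        children = token.get(key)
        if isinstance(children, list):  # leaf tokens hold a plain string here
            for child in children:
                yield from iter_tokens(child)


# The sample token from this section.
sample = json.loads("""
{
  "type": "section",
  "numbering": "3",
  "title": [{"type": "text", "content": "Methods"}],
  "content": [
    {"type": "text", "content": "We propose..."},
    {"type": "equation", "content": "E = mc^2", "display": "block", "numbering": "1"}
  ]
}
""")

equations = [t["content"] for t in iter_tokens(sample) if t["type"] == "equation"]
print(equations)  # ['E = mc^2']
```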

Next Steps

Authentication — Get your API key
Examples — Common use cases and code samples
API Reference — Full endpoint documentation
