API Overview

The ScienceStack API provides programmatic access to 300k+ parsed arXiv papers with structured data, semantic search, and citation graphs. Our index includes all arXiv papers from 2025 onwards, plus highly-cited papers from earlier years.

Base URL: https://sciencestack.ai/api/v1

Why ScienceStack?

Feature comparison, ScienceStack vs. PDF + OCR:

  • Parsed from LaTeX source
  • Pre-parsed, low-latency access
  • Structured AST (stable node IDs)
  • Clean LaTeX equations
  • AI summaries (paper + section)
  • Direct figure URLs + dimensions
  • Tables with structured data
  • Export to Markdown / LaTeX / text

Quick Start

# 1. Get your API key from https://sciencestack.ai/settings/api

# 2. Search for papers (free, no quota)
curl "https://sciencestack.ai/api/v1/search?q=attention+mechanisms" \
  -H "x-api-key: sk_live_your_key_here"

# 3. Browse recent papers in a field (free, no quota)
curl "https://sciencestack.ai/api/v1/papers?field=machine-learning&sort=recent" \
  -H "x-api-key: sk_live_your_key_here"

# 4. Preview a paper (free, no quota)
curl "https://sciencestack.ai/api/v1/papers/1706.03762/overview" \
  -H "x-api-key: sk_live_your_key_here"

# 5. Get full content as markdown (uses quota)
curl "https://sciencestack.ai/api/v1/papers/1706.03762/content?format=markdown" \
  -H "x-api-key: sk_live_your_key_here"

# 6. Get all equations with LaTeX (uses quota)
curl "https://sciencestack.ai/api/v1/papers/1706.03762/nodes?types=equation&format=markdown" \
  -H "x-api-key: sk_live_your_key_here"

# 7. Get multiple specific nodes in one request (uses quota)
curl "https://sciencestack.ai/api/v1/papers/1706.03762/nodes?nodeIds=eq:1,eq:2,fig:1&format=markdown" \
  -H "x-api-key: sk_live_your_key_here"

Tip for agents: Start with /overview to get TOC, summaries, and lists of all figures/tables/equations in one call — then drill down to specific elements via /nodes?nodeIds=eq:1,fig:2.
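
The overview-then-drill-down flow from the tip above can be sketched in Python. This is a minimal sketch using only the standard library: the helper names `overview_request` and `nodes_request` are hypothetical, and the code only builds the requests, since the response shape is not described on this page.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://sciencestack.ai/api/v1"


def overview_request(paper_id: str, api_key: str) -> Request:
    """Build the GET request for a paper's overview (free, no quota)."""
    return Request(
        f"{BASE_URL}/papers/{paper_id}/overview",
        headers={"x-api-key": api_key},
    )


def nodes_request(paper_id: str, node_ids: list[str], api_key: str,
                  fmt: str = "markdown") -> Request:
    """Build the GET request for specific nodes (uses quota)."""
    # safe=":," leaves IDs such as eq:1,fig:2 unescaped, as in the curl examples.
    query = urlencode({"nodeIds": ",".join(node_ids), "format": fmt}, safe=":,")
    return Request(
        f"{BASE_URL}/papers/{paper_id}/nodes?{query}",
        headers={"x-api-key": api_key},
    )
```

Either request can be passed to `urllib.request.urlopen` to send it. An agent would typically send the overview request first, pick node IDs from the element lists it returns, and then send a single nodes request for just those IDs.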

Available Endpoints

Discovery (Free)

| Endpoint | Description |
|---|---|
| GET /search | Search papers by keywords, author, or arXiv ID |
| GET /papers | Browse papers by field or arXiv category |
| GET /papers/{id}/overview | Get paper metadata, TOC, and element lists |
| GET /papers/overview?ids= | Batch overview for multiple papers (Pro+) |
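
Because the batch overview endpoint accepts up to 10 papers per request (see Rate Limits), a longer ID list has to be split into chunks. The sketch below shows one way to do that. It assumes `ids` takes a comma-separated list, by analogy with `nodeIds`; this page does not confirm that, and the helper name `batch_overview_urls` is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://sciencestack.ai/api/v1"
MAX_BATCH = 10  # "up to 10 papers per request", per the Rate Limits notes


def batch_overview_urls(paper_ids: list[str],
                        batch_size: int = MAX_BATCH) -> list[str]:
    """Split paper IDs into chunks and build one batch-overview URL per chunk."""
    urls = []
    for start in range(0, len(paper_ids), batch_size):
        chunk = paper_ids[start:start + batch_size]
        # Assumption: ids is comma-separated, like nodeIds in the Quick Start.
        query = urlencode({"ids": ",".join(chunk)}, safe=",")
        urls.append(f"{BASE_URL}/papers/overview?{query}")
    return urls
```

Each returned URL still needs the `x-api-key` header when it is requested.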

Content (Uses Quota)

| Endpoint | Description |
|---|---|
| GET /papers/{id}/content | Full paper content (markdown, latex, or raw AST) |
| GET /papers/{id}/nodes | Fetch specific sections, equations, figures by ID or type |
| GET /papers/{id}/references | Bibliography with Semantic Scholar enrichment |
| GET /papers/{id}/citations | Papers that cite this paper |

User

| Endpoint | Description |
|---|---|
| GET /me/papers | List your uploaded papers |
| GET /me/usage | Check your API usage and limits |

Rate Limits

API access is limited by the number of unique papers you access per month and by requests per minute:

| Tier | Papers/month | Requests/min |
|---|---|---|
| Free | 10 | 30 |
| Pro | 200 | 60 |
| Team | 1,000 | 120 |

  • Discovery endpoints (/search, /papers browsing, and /overview) are free (no quota consumed)
  • /papers/overview?ids= batch endpoint available for Pro+ (up to 10 papers per request)
  • Once you access a paper, further requests to that paper consume no additional quota for the rest of the month
  • Limits reset on the 1st of each month (UTC)
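
The quota rules above (unique papers per month, free repeat access, reset on the 1st in UTC) can be mirrored on the client to avoid spending quota by accident. The following is a minimal sketch of such a tracker. The class `PaperQuota` is hypothetical and is only a local estimate; the server's count, available from /me/usage, remains authoritative.

```python
from datetime import datetime, timezone
from typing import Optional


class PaperQuota:
    """Client-side estimate of the unique-papers-per-month quota."""

    def __init__(self, monthly_limit: int) -> None:
        self.monthly_limit = monthly_limit
        self._month = None        # (year, month) that _seen belongs to
        self._seen: set = set()   # paper IDs already accessed this month

    def _sync_month(self, now: Optional[datetime]) -> None:
        # Limits reset on the 1st of each month (UTC), so compare in UTC.
        # A caller-supplied `now` should be timezone-aware.
        now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
        month = (now.year, now.month)
        if month != self._month:
            self._month = month
            self._seen.clear()

    def can_access(self, paper_id: str, now: Optional[datetime] = None) -> bool:
        """True if the paper was already accessed this month or quota remains."""
        self._sync_month(now)
        return paper_id in self._seen or len(self._seen) < self.monthly_limit

    def record(self, paper_id: str, now: Optional[datetime] = None) -> None:
        """Note that a quota-consuming request was made for this paper."""
        self._sync_month(now)
        self._seen.add(paper_id)

    def remaining(self, now: Optional[datetime] = None) -> int:
        self._sync_month(now)
        return max(0, self.monthly_limit - len(self._seen))
```

A client on the Free tier would create `PaperQuota(10)`, call `can_access` before a /content or /nodes request, and call `record` after it succeeds. The tracker does not cover the requests-per-minute limit.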

Output Formats

Use the format query parameter on /content and /nodes:

  • markdown (default) — Clean markdown, ideal for LLM context
  • latex — Reconstructed LaTeX source
  • raw — JSON AST with full token structure (Pro+)

Token AST (Raw Format)

When you request format=raw, you get the paper as a token tree — a hierarchical AST parsed directly from LaTeX:

{
  "type": "section",
  "numbering": "3",
  "title": [{"type": "text", "content": "Methods"}],
  "content": [
    {"type": "text", "content": "We propose..."},
    {"type": "equation", "content": "E = mc^2", "display": "block", "numbering": "1"}
  ]
}

Each token has a type field — see the API Reference for the full structure of each type.
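
As an illustration of working with the token tree, the sketch below walks the sample section shown above and collects its equations. It assumes child tokens appear only as lists under the `title` and `content` keys, as in that sample; other token types may nest differently, so check the API Reference before relying on it. The function name `iter_tokens` is hypothetical.

```python
def iter_tokens(token: dict):
    """Yield a token, then depth-first every token nested under it."""
    yield token
    for key in ("title", "content"):
        children = token.get(key)
        # "content" is a list for containers but a plain string for leaf tokens.
        if isinstance(children, list):
            for child in children:
                if isinstance(child, dict):
                    yield from iter_tokens(child)


# The sample section from this page.
section = {
    "type": "section",
    "numbering": "3",
    "title": [{"type": "text", "content": "Methods"}],
    "content": [
        {"type": "text", "content": "We propose..."},
        {"type": "equation", "content": "E = mc^2",
         "display": "block", "numbering": "1"},
    ],
}

# Collect the LaTeX source of every equation in the tree.
equations = [t["content"] for t in iter_tokens(section)
             if t.get("type") == "equation"]
```

The same generator can filter on any other `type` value, for example to pull out all text tokens for plain-text indexing.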
