API Overview
The ScienceStack API provides programmatic access to 300k+ parsed arXiv papers with structured data, semantic search, and citation graphs. Our index includes all arXiv papers from 2025 onwards, plus highly-cited papers from earlier years.
Base URL: https://sciencestack.ai/api/v1
Why ScienceStack?
| Feature | ScienceStack | PDF + OCR |
|---|---|---|
| Parsed from LaTeX source | ✓ | ✗ |
| Pre-parsed, low-latency access | ✓ | ✗ |
| Structured AST (stable node IDs) | ✓ | ✗ |
| Clean LaTeX equations | ✓ | ✗ |
| AI summaries (paper + section) | ✓ | ✗ |
| Direct figure URLs + dimensions | ✓ | ✗ |
| Tables with structured data | ✓ | ✗ |
| Export to Markdown / LaTeX / text | ✓ | ✗ |
Quick Start
# 1. Get your API key from https://sciencestack.ai/settings/api
# 2. Search for papers (free, no quota)
curl "https://sciencestack.ai/api/v1/search?q=attention+mechanisms" \
-H "x-api-key: sk_live_your_key_here"
# 3. Browse recent papers in a field (free, no quota)
curl "https://sciencestack.ai/api/v1/papers?field=machine-learning&sort=recent" \
-H "x-api-key: sk_live_your_key_here"
# 4. Preview a paper (free, no quota)
curl "https://sciencestack.ai/api/v1/papers/1706.03762/overview" \
-H "x-api-key: sk_live_your_key_here"
# 5. Get full content as markdown (uses quota)
curl "https://sciencestack.ai/api/v1/papers/1706.03762/content?format=markdown" \
-H "x-api-key: sk_live_your_key_here"
# 6. Get all equations with LaTeX (uses quota)
curl "https://sciencestack.ai/api/v1/papers/1706.03762/nodes?types=equation&format=markdown" \
-H "x-api-key: sk_live_your_key_here"
# 7. Get multiple specific nodes in one request (uses quota)
curl "https://sciencestack.ai/api/v1/papers/1706.03762/nodes?nodeIds=eq:1,eq:2,fig:1&format=markdown" \
-H "x-api-key: sk_live_your_key_here"
Tip for agents: Start with /overview to get TOC, summaries, and lists of all figures/tables/equations in one call, then drill down to specific elements via /nodes?nodeIds=eq:1,fig:2.
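The overview-then-drill-down flow from the tip can be sketched in Python. This is a minimal illustration, not an official client: the helper names (`overview_url`, `nodes_url`, `auth_headers`) are hypothetical, and only the paths, query parameters, and header name are taken from the curl examples above. The helpers only build requests; sending them with an HTTP library of your choice is left out.

```python
from urllib.parse import urlencode

BASE_URL = "https://sciencestack.ai/api/v1"


def overview_url(paper_id: str) -> str:
    """URL for the free overview call (TOC, summaries, element lists)."""
    return f"{BASE_URL}/papers/{paper_id}/overview"


def nodes_url(paper_id: str, node_ids: list[str], fmt: str = "markdown") -> str:
    """URL for fetching several specific nodes in one quota-consuming request."""
    # Keep ":" and "," literal so the query string matches the curl examples.
    query = urlencode({"nodeIds": ",".join(node_ids), "format": fmt}, safe=":,")
    return f"{BASE_URL}/papers/{paper_id}/nodes?{query}"


def auth_headers(api_key: str) -> dict[str, str]:
    """Header dict carrying the API key, as in the curl examples."""
    return {"x-api-key": api_key}


if __name__ == "__main__":
    # Step 1: one free call to learn which elements exist.
    print(overview_url("1706.03762"))
    # Step 2: one call for just the elements you need.
    print(nodes_url("1706.03762", ["eq:1", "fig:2"]))
```

Building the node list from the overview response is deliberately omitted, since the response schema is not shown on this page; see the API Reference for the actual field names.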
Available Endpoints
Discovery (Free)
| Endpoint | Description |
|---|---|
| GET /search | Search papers by keywords, author, or arXiv ID |
| GET /papers | Browse papers by field or arXiv category |
| GET /papers/{id}/overview | Get paper metadata, TOC, and element lists |
| GET /papers/overview?ids= | Batch overview for multiple papers (Pro+) |
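Since the batch overview endpoint accepts at most 10 papers per request (see Rate Limits), a longer list of IDs has to be split into chunks. The sketch below shows one way to do that. The function name `batch_overview_urls` is hypothetical, and the comma-separated form of the `ids` parameter is an assumption modeled on how `nodeIds` is written in the Quick Start; check the API Reference for the actual format.

```python
from urllib.parse import urlencode

BASE_URL = "https://sciencestack.ai/api/v1"
MAX_BATCH = 10  # per-request cap stated under Rate Limits


def batch_overview_urls(paper_ids: list[str]) -> list[str]:
    """Split paper IDs into batch-overview URLs of at most MAX_BATCH IDs each."""
    # Drop duplicates while keeping first-seen order.
    unique_ids = list(dict.fromkeys(paper_ids))
    urls = []
    for start in range(0, len(unique_ids), MAX_BATCH):
        chunk = unique_ids[start:start + MAX_BATCH]
        # Assumption: IDs are comma-separated, like nodeIds in the Quick Start.
        query = urlencode({"ids": ",".join(chunk)}, safe=",")
        urls.append(f"{BASE_URL}/papers/overview?{query}")
    return urls


if __name__ == "__main__":
    for url in batch_overview_urls(["1706.03762", "1706.03762"]):
        print(url)
```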
Content (Uses Quota)
| Endpoint | Description |
|---|---|
| GET /papers/{id}/content | Full paper content (markdown, latex, or raw AST) |
| GET /papers/{id}/nodes | Fetch specific sections, equations, figures by ID or type |
| GET /papers/{id}/references | Bibliography with Semantic Scholar enrichment |
| GET /papers/{id}/citations | Papers that cite this paper |
User
| Endpoint | Description |
|---|---|
| GET /me/papers | List your uploaded papers |
| GET /me/usage | Check your API usage and limits |
Rate Limits
API access is limited by the number of unique papers you access per month and by requests per minute:
| Tier | Papers/month | Requests/min |
|---|---|---|
| Free | 10 | 30 |
| Pro | 200 | 60 |
| Team | 1,000 | 120 |
- /overview and /search are free (no quota consumed)
- /papers/overview?ids= batch endpoint available for Pro+ (up to 10 papers per request)
- Once you access a paper, unlimited requests to that paper for the rest of the month
- Limits reset on the 1st of each month (UTC)
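To make the quota rules concrete, here is a small client-side sketch of how unique-paper counting behaves: a repeat request to an already-accessed paper costs nothing, and the count resets at the start of each UTC month. The `PaperQuota` class is hypothetical and only mirrors the rules and tier limits listed above. The server remains the source of truth, so use GET /me/usage for your real numbers.

```python
from datetime import datetime, timezone
from typing import Optional

# Papers/month per tier, copied from the table above.
PAPER_LIMITS = {"free": 10, "pro": 200, "team": 1000}


class PaperQuota:
    """Local estimate of unique papers accessed in the current UTC month."""

    def __init__(self, tier: str):
        self.limit = PAPER_LIMITS[tier]
        self.month = None  # (year, month) of the period being counted
        self.seen = set()  # paper IDs already accessed this month

    def _roll_over(self, now: datetime) -> None:
        # Limits reset on the 1st of each month (UTC).
        month = (now.year, now.month)
        if month != self.month:
            self.month = month
            self.seen.clear()

    def try_access(self, paper_id: str, now: Optional[datetime] = None) -> bool:
        """Return True if a content request for this paper fits the quota."""
        now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
        self._roll_over(now)
        if paper_id in self.seen:
            return True  # already counted: repeat requests are free
        if len(self.seen) >= self.limit:
            return False  # a new paper would exceed the monthly limit
        self.seen.add(paper_id)
        return True

    def remaining(self) -> int:
        return self.limit - len(self.seen)
```

On the Free tier, for example, ten distinct papers use up the month, but any of those ten can still be requested again without touching the quota.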
Output Formats
Use the format query parameter on /content and /nodes:
- markdown (default): Clean markdown, ideal for LLM context
- latex: Reconstructed LaTeX source
- raw: JSON AST with full token structure (Pro+)
Token AST (Raw Format)
When you request format=raw, you get the paper as a token tree — a hierarchical AST parsed directly from LaTeX:
{
"type": "section",
"numbering": "3",
"title": [{"type": "text", "content": "Methods"}],
"content": [
{"type": "text", "content": "We propose..."},
{"type": "equation", "content": "E = mc^2", "display": "block", "numbering": "1"}
]
}
Each token has a type field. See the API Reference for the full structure of each type.
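As an illustration of working with the raw format, the sketch below walks a token tree and collects every equation. It assumes that child tokens live in the `title` and `content` lists, as in the sample above; other token types may nest children under different keys, so treat the traversal as a starting point. The function names `iter_tokens` and `collect_equations` are hypothetical.

```python
from typing import Any, Iterator, Optional

# The sample token from above, used as test input.
SAMPLE = {
    "type": "section",
    "numbering": "3",
    "title": [{"type": "text", "content": "Methods"}],
    "content": [
        {"type": "text", "content": "We propose..."},
        {"type": "equation", "content": "E = mc^2", "display": "block", "numbering": "1"},
    ],
}


def iter_tokens(token: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield a token and all of its descendants, depth-first."""
    yield token
    # Assumption: children appear only under "title" and "content".
    for key in ("title", "content"):
        children = token.get(key)
        # Leaf tokens hold a plain string in "content", so only recurse into lists.
        if isinstance(children, list):
            for child in children:
                if isinstance(child, dict):
                    yield from iter_tokens(child)


def collect_equations(root: dict[str, Any]) -> list[tuple[Optional[str], Any]]:
    """Return (numbering, LaTeX) pairs for every equation token under root."""
    return [
        (tok.get("numbering"), tok.get("content"))
        for tok in iter_tokens(root)
        if tok.get("type") == "equation"
    ]


if __name__ == "__main__":
    print(collect_equations(SAMPLE))
```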
Next Steps
- Authentication — Get your API key
- Examples — Common use cases and code samples
- API Reference — Full endpoint documentation