Crawlio Docs

MCP Overview

What you can do

Crawlio gives your AI access to ~362 tools across crawling, browser automation, and site analysis. Your AI never sees all of them at once: the aggregator loads only the tool schemas relevant to the current task, so the context window stays clean.

With Crawlio MCP, your AI can:

  • Crawl sites. Start, pause, resume, and monitor website downloads.
  • Analyze pages. Detect frameworks, extract structured data, run SEO checks.
  • Automate browsers. Navigate, click, screenshot, and extract from live pages.
  • Export results. Save as ZIP, WARC, single HTML, or deploy-ready bundles.
  • Manage auth sessions. Log into sites once. Crawlio stores encrypted sessions that all 3 pillars share.

How it works

Three MCP servers work together behind one interface. You install once and interact through your AI client.

The aggregator sits in front. It exposes 5 meta-tools. When your AI calls crawlio_discover, it gets back only the tool schemas relevant to its current task. When it calls crawlio_call, the aggregator routes to the correct server. Your AI never manages connections to individual servers.
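The discover-then-call pattern can be sketched in a few lines. This is an illustrative simulation, not Crawlio's actual wire format: the registry contents, payload shapes, and matching logic are all assumptions.

```python
# Hypothetical registry mapping tool names to owning pillars (illustrative).
REGISTRY = {
    "start_crawl":   {"pillar": 3, "description": "Start a website crawl"},
    "export_site":   {"pillar": 3, "description": "Export a crawled site"},
    "browser_click": {"pillar": 1, "description": "Click an element in the live browser"},
}

def crawlio_discover(task: str) -> list[dict]:
    """Return only the tool schemas whose description overlaps the task."""
    words = set(task.lower().split())
    return [
        {"name": name, **meta}
        for name, meta in REGISTRY.items()
        if words & set(meta["description"].lower().split())
    ]

def crawlio_call(tool: str, args: dict) -> dict:
    """Route a call to the pillar that owns the tool."""
    pillar = REGISTRY[tool]["pillar"]
    # The real aggregator forwards this over MCP; here we just echo the route.
    return {"routed_to_pillar": pillar, "tool": tool, "args": args}

print(crawlio_discover("export site"))
print(crawlio_call("export_site", {"format": "warc"}))
```

The point is the shape of the interaction: the AI asks a narrow question, receives a narrow answer, and never holds all ~362 schemas in context at once.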

The 3 pillars

| Pillar | Name | What it does | Tools |
|---|---|---|---|
| 1 | Chrome Extension | Live browser automation via CDP | ~114 |
| 2 | Headless Agent | Background browser, converters, interceptor | ~199 |
| 3 | Crawlio App | Engine, export, intel, OCR, vault | 49 |

Total: ~362 tools. Your AI sees 5.

The 5 meta-tools

The aggregator exposes these to your AI client:

| Meta-tool | What it does |
|---|---|
| crawlio_discover | List available tools across all pillars. Returns only what matches the current task. |
| crawlio_call | Route a tool call to the correct pillar. |
| crawlio_do | Execute a high-level task. The aggregator picks which pillar handles it. |
| crawlio_cortex | Query intelligence data across pillar boundaries. |
| crawlio_consult | Multi-pillar consultation for complex tasks. |

SessionVault

Log into sites once. Crawlio stores encrypted sessions. All 3 pillars can use them. Your AI handles authenticated crawls without seeing credentials.

| Vault tool | What it does |
|---|---|
| vault_list_domains | List domains with stored sessions |
| vault_get_session | Retrieve a session for authenticated crawling |
| vault_mark_expired | Mark a session as expired |
| vault_delete | Delete a stored session |
| vault_request_login | Open the auth browser so you can log in |

The encryption key stays in Pillar 3. Access is token-gated. Every session access is audit-logged.
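An authenticated crawl built from the vault tools above might flow as follows. This is a hypothetical sketch: the argument names, return shapes, and the `demo_call` transport stub are assumptions made so the example runs standalone.

```python
def authenticated_crawl(call, domain: str, start_url: str):
    """Reuse a stored session for `domain`, requesting a login if none exists."""
    if domain not in call("vault_list_domains", {}):
        call("vault_request_login", {"domain": domain})  # user logs in once
    session = call("vault_get_session", {"domain": domain})
    # The crawl engine receives a session handle, never raw credentials.
    return call("start_crawl", {"url": start_url, "session": session["id"]})

# Stub transport standing in for crawlio_call, so the sketch is self-contained.
def demo_call(tool, args):
    canned = {
        "vault_list_domains": ["docs.stripe.com"],
        "vault_get_session": {"id": "sess-1", "domain": args.get("domain")},
        "start_crawl": {"status": "started", "url": args.get("url")},
    }
    return canned.get(tool, {})

result = authenticated_crawl(demo_call, "docs.stripe.com", "https://docs.stripe.com")
print(result)
```

Note that the AI only ever passes an opaque session identifier around; decryption happens inside Pillar 3.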

Supported clients

| Client | Transport | Status |
|---|---|---|
| Claude Code | stdio | Full support |
| Claude Desktop | stdio | Full support |
| Cursor | stdio | Full support |
| Windsurf | stdio | Full support |
| ChatGPT Desktop | stdio | Full support |
| VS Code (Copilot) | stdio | Full support |
| Gemini CLI | stdio | Full support |
| Zed | stdio | Full support |
| GitHub Copilot CLI | stdio | Full support |
| Any MCP client | stdio | Standard protocol |

Run crawlio-mcp init to auto-detect and configure all installed clients.
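If your client is not auto-detected, most stdio MCP clients accept a server entry in their JSON config. A hypothetical hand-written equivalent is shown below; the `mcpServers` key follows the common client convention, but the exact command arguments here are an assumption.

```json
{
  "mcpServers": {
    "crawlio": {
      "command": "crawlio-mcp",
      "args": []
    }
  }
}
```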

Example workflow

You:      "Download the Stripe docs and export as WARC"
Your AI:   Calls crawlio_do("crawl https://docs.stripe.com")
           Monitors progress automatically
           Calls crawlio_call("export_site", {format: "warc", destinationPath: "~/stripe-docs.warc"})
           "Done. 1,247 pages archived to stripe-docs.warc"
 
You:      "Which pages failed?"
Your AI:   Calls crawlio_call("get_failed_urls")
           "12 pages failed. Mostly rate-limited API reference pages."
 
You:      "Recrawl them"
Your AI:   Calls crawlio_call("recrawl_urls", {urls: [...]})
           "Re-crawling 12 URLs."
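The "which pages failed / recrawl them" exchange above reduces to a small client-side loop. The tool names come from the transcript; the transport stub and return shapes are assumptions for illustration.

```python
def retry_failed(call):
    """Fetch the failed URLs from the last crawl and queue them again."""
    failed = call("get_failed_urls", {})
    if failed:
        call("recrawl_urls", {"urls": failed})
    return len(failed)

# Stub transport pretending one page failed, so the sketch runs standalone.
def demo_call(tool, args):
    if tool == "get_failed_urls":
        return ["https://docs.stripe.com/api/errors"]
    return {"queued": len(args.get("urls", []))}

print(retry_failed(demo_call))  # number of URLs queued for recrawl
```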

Direct access (without the aggregator)

You can also connect to Pillar 3 directly. In this mode you get 6 tools (code mode, the default) or 49 tools (full mode), without the aggregator layer.

| Mode | Tools | Schema tokens | Use case |
|---|---|---|---|
| Code (default) | 6 | ~1,200 | CLI pipelines, token-constrained agents |
| Full (--full) | 49 | ~5,500 | Interactive use, GUI clients, debugging |

Code mode compresses 49 tools into 6 using a search-and-execute pattern. See Code Mode for details.
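The search-and-execute pattern itself is simple: many tools hide behind two small meta-tools. A minimal sketch, with illustrative tool names (the real server's tools and matching rules may differ):

```python
# Full tool set, normally too large to expose schema-by-schema.
TOOLS = {
    "start_crawl": lambda args: {"status": "crawling", "url": args["url"]},
    "export_site": lambda args: {"status": "exporting", "format": args["format"]},
    # ...the real server compresses 49 tools this way
}

def search_tools(query: str) -> list[str]:
    """Meta-tool 1: find tool names matching a substring query."""
    return [name for name in TOOLS if query.lower() in name]

def execute_tool(name: str, args: dict):
    """Meta-tool 2: run a tool found via search_tools."""
    return TOOLS[name](args)

print(search_tools("export"))
print(execute_tool("export_site", {"format": "warc"}))
```

The schema cost stays constant as tools are added, which is why code mode needs only ~1,200 schema tokens where full mode needs ~5,500.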

Next steps

© 2026 Crawlio. All rights reserved.