MCP Overview
What you can do
Crawlio gives your AI access to ~362 tools across crawling, browser automation, and site analysis. Your AI never sees all of them at once. The aggregator loads only the tool schemas relevant to the current task, so your AI's context window stays clean.
With Crawlio MCP, your AI can:
- Crawl sites. Start, pause, resume, and monitor website downloads.
- Analyze pages. Detect frameworks, extract structured data, run SEO checks.
- Automate browsers. Navigate, click, screenshot, and extract from live pages.
- Export results. Save as ZIP, WARC, single HTML, or deploy-ready bundles.
- Manage auth sessions. Log into sites once. Crawlio stores encrypted sessions that all 3 pillars share.
How it works
Three MCP servers work together behind one interface. You install once and interact through your AI client.
The aggregator sits in front. It exposes 5 meta-tools. When your AI calls crawlio_discover, it gets back only the tool schemas relevant to its current task. When it calls crawlio_call, the aggregator routes to the correct server. Your AI never manages connections to individual servers.
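For a concrete sense of the flow, here is a minimal sketch of connecting to the aggregator with the official MCP TypeScript SDK. The crawlio-mcp command comes from this page; the client name is an illustrative assumption.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn the aggregator as a stdio MCP server (command name from this page).
const transport = new StdioClientTransport({ command: "crawlio-mcp", args: [] });
const client = new Client({ name: "example-client", version: "1.0.0" });
await client.connect(transport);

// The aggregator exposes only the 5 meta-tools, not the ~362 underlying tools.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name)); // e.g. crawlio_discover, crawlio_call, ...
```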
The 3 pillars
| Pillar | Name | What it does | Tools |
|---|---|---|---|
| 1 | Chrome Extension | Live browser automation via CDP | ~114 |
| 2 | Headless Agent | Background browser, converters, interceptor | ~199 |
| 3 | Crawlio App | Engine, export, intel, OCR, vault | 49 |
Total: ~362 tools. Your AI sees 5.
The 5 meta-tools
The aggregator exposes these to your AI client:
| Meta-tool | What it does |
|---|---|
| crawlio_discover | List available tools across all pillars. Returns only what matches the current task. |
| crawlio_call | Route a tool call to the correct pillar. |
| crawlio_do | Execute a high-level task. The aggregator picks which pillar handles it. |
| crawlio_cortex | Query intelligence data across pillar boundaries. |
| crawlio_consult | Multi-pillar consultation for complex tasks. |
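Continuing the sketch from "How it works", a typical exchange chains crawlio_discover and crawlio_call. The argument shapes (task, tool, args) are assumptions for illustration; the real schemas are whatever crawlio_discover returns.

```typescript
// Ask the aggregator which tools fit the task; only matching schemas come back.
// The { task } argument shape is an assumption, not a documented schema.
const discovered = await client.callTool({
  name: "crawlio_discover",
  arguments: { task: "export a finished crawl as WARC" },
});

// Route the chosen tool through the aggregator; it picks the right pillar.
// The { tool, args } key names are likewise illustrative.
const exported = await client.callTool({
  name: "crawlio_call",
  arguments: {
    tool: "export_site",
    args: { format: "warc", destinationPath: "~/stripe-docs.warc" },
  },
});
```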
SessionVault
Log into sites once. Crawlio stores encrypted sessions. All 3 pillars can use them. Your AI handles authenticated crawls without seeing credentials.
| Vault tool | What it does |
|---|---|
| vault_list_domains | List domains with stored sessions |
| vault_get_session | Retrieve a session for authenticated crawling |
| vault_mark_expired | Mark a session as expired |
| vault_delete | Delete a stored session |
| vault_request_login | Open the auth browser so you can log in |
The encryption key stays in Pillar 3. Access is token-gated. Every session access is audit-logged.
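As a sketch of how a vault-backed crawl might look through the aggregator (tool names from the table above; the argument shapes and the example domain are assumptions), continuing the client from the earlier sketch:

```typescript
// List domains that already have stored sessions (tool name from the table above).
const domains = await client.callTool({
  name: "crawlio_call",
  arguments: { tool: "vault_list_domains", args: {} },
});

// If the target domain has no session, open the auth browser for a one-time login.
// The { domain } argument shape and the domain itself are illustrative.
await client.callTool({
  name: "crawlio_call",
  arguments: { tool: "vault_request_login", args: { domain: "dashboard.stripe.com" } },
});

// Later, retrieve the session for an authenticated crawl. The session stays
// encrypted in Pillar 3; the AI never sees raw credentials.
const session = await client.callTool({
  name: "crawlio_call",
  arguments: { tool: "vault_get_session", args: { domain: "dashboard.stripe.com" } },
});
```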
Supported clients
| Client | Transport | Status |
|---|---|---|
| Claude Code | stdio | Full support |
| Claude Desktop | stdio | Full support |
| Cursor | stdio | Full support |
| Windsurf | stdio | Full support |
| ChatGPT Desktop | stdio | Full support |
| VS Code (Copilot) | stdio | Full support |
| Gemini CLI | stdio | Full support |
| Zed | stdio | Full support |
| GitHub Copilot CLI | stdio | Full support |
| Any MCP client | stdio | Standard protocol |
Run crawlio-mcp init to auto-detect and configure all installed clients.
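If you prefer to configure a client by hand, the entry is a standard stdio MCP server registration. This is a sketch of the common mcpServers shape used by clients such as Claude Desktop and Cursor; the exact config file location varies by client, and any extra flags should be checked against your install.

```json
{
  "mcpServers": {
    "crawlio": {
      "command": "crawlio-mcp",
      "args": []
    }
  }
}
```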
Example workflow
You: "Download the Stripe docs and export as WARC"
Your AI: Calls crawlio_do("crawl https://docs.stripe.com")
Monitors progress automatically
Calls crawlio_call("export_site", {format: "warc", destinationPath: "~/stripe-docs.warc"})
"Done. 1,247 pages archived to stripe-docs.warc"
You: "Which pages failed?"
Your AI: Calls crawlio_call("get_failed_urls")
"12 pages failed. Mostly rate-limited API reference pages."
You: "Recrawl them"
Your AI: Calls crawlio_call("recrawl_urls", {urls: [...]})
"Re-crawling 12 URLs."
Direct access (without the aggregator)
You can also connect to Pillar 3 directly. In this mode, you get 49 tools (full mode) or 6 tools (code mode) without the aggregator layer.
| Mode | Tools | Schema tokens | Use case |
|---|---|---|---|
| Code (default) | 6 | ~1,200 | CLI pipelines, token-constrained agents |
| Full (--full) | 49 | ~5,500 | Interactive use, GUI clients, debugging |
Code mode compresses 49 tools into 6 using a search-and-execute pattern. See Code Mode for details.
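As a sketch, connecting directly to Pillar 3 only changes the command you spawn; the MCP client side is identical to the aggregator examples above. The crawlio-pillar3 command name here is a placeholder assumption, not a documented binary; --full is the flag from the table above.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Placeholder command name: substitute the real Pillar 3 binary from your
// install. "--full" is the documented flag for full mode.
const direct = new StdioClientTransport({
  command: "crawlio-pillar3",
  args: ["--full"],
});
const directClient = new Client({ name: "direct-example", version: "1.0.0" });
await directClient.connect(direct);

// Full mode lists all 49 tools; omit --full for the 6-tool code mode.
const { tools } = await directClient.listTools();
console.log(tools.length);
```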