JIT Context
The problem
Crawlio has ~362 tools across 3 MCP servers. Loading all of them into your AI's context window would consume over 15,000 tokens on schema alone, before any work begins. Attention degrades as tool count grows. Your AI spends reasoning capacity on tool selection instead of task execution.
The solution
The aggregator never dumps all ~362 tool schemas into the context window. Instead, 5 meta-tools are always present. When your AI calls crawlio_discover, it gets back only the tool schemas relevant to its current task. Everything else stays off the context window.
Without JIT context:
362 tool schemas loaded → ~15,000 tokens consumed → attention diluted
With JIT context:
5 meta-tool schemas loaded → ~800 tokens consumed → tools loaded on demand
How discovery works
Your AI calls crawlio_discover with a description of what it needs:
crawlio_discover("I need to crawl a site and export as WARC")
--> Returns schemas for: start_crawl, get_crawl_status, export_site, get_export_status
crawlio_discover("I need to capture a page and check its framework")
--> Returns schemas for: trigger_capture, get_enrichment, get_tech_stack
The aggregator searches across all 3 pillars and returns only matching tool schemas. Your AI learns the parameters on demand, then calls crawlio_call or crawlio_do to execute.
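The matching step can be sketched as a keyword filter over a tool registry. The registry entries and the scoring rule below are illustrative assumptions, not Crawlio's real index:

```typescript
// Hypothetical discovery sketch: match a task description against tool
// descriptions by keyword. Registry contents are illustrative only.
interface ToolSchema {
  name: string;
  description: string;
}

const registry: ToolSchema[] = [
  { name: "start_crawl", description: "start a crawl of a site" },
  { name: "export_site", description: "export a crawled site as WARC" },
  { name: "trigger_capture", description: "capture a page snapshot" },
  { name: "get_tech_stack", description: "detect the framework of a page" },
];

function discover(query: string): ToolSchema[] {
  const words = query.toLowerCase().split(/\W+/);
  // Keep a tool if any meaningful query word appears in its description.
  return registry.filter((t) =>
    words.some((w) => w.length > 3 && t.description.includes(w))
  );
}

console.log(discover("I need to crawl a site and export as WARC").map((t) => t.name));
```

A real aggregator would use semantic search rather than substring matching, but the contract is the same: a free-text request in, a small set of relevant schemas out.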
How routing works
crawlio_call
Routes a tool call to the correct pillar:
crawlio_call("start_crawl", { url: "https://example.com" })
--> Routed to Pillar 3 (Crawlio App)
crawlio_call("browser_navigate", { url: "https://example.com" })
--> Routed to Pillar 1 (Chrome Extension) or Pillar 2 (Headless Agent)
Your AI does not need to know which pillar owns a tool. The aggregator handles routing.
crawlio_do
Handles high-level tasks with automatic pillar selection:
crawlio_do("capture this page")
--> Uses Chrome Extension if a tab is connected, Headless Agent if not
crawlio_do picks the best pillar based on current session state. If the Chrome extension has an active tab, it uses that. If not, it falls back to the headless agent. For tasks that only Pillar 3 handles (crawl control, export, vault), it routes there directly.
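The selection logic reduces to a small decision: Pillar-3-only task categories route directly, browser tasks follow session state. A minimal sketch, with illustrative task categories and pillar labels:

```typescript
// Hypothetical sketch of crawlio_do's pillar selection. The task
// categories and labels here are illustrative, not the real API.
type Pillar = "chrome-extension" | "headless-agent" | "crawlio-app";
type TaskKind = "browser" | "crawl" | "export" | "vault";

function selectPillar(task: TaskKind, tabConnected: boolean): Pillar {
  // Crawl control, export, and vault are Pillar-3-only: route directly.
  if (task !== "browser") return "crawlio-app";
  // Browser tasks are session-sticky: prefer a connected extension tab.
  return tabConnected ? "chrome-extension" : "headless-agent";
}

console.log(selectPillar("browser", true));  // "chrome-extension"
console.log(selectPillar("browser", false)); // "headless-agent"
console.log(selectPillar("export", true));   // "crawlio-app"
```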
crawlio_cortex
Queries intelligence data across pillar boundaries:
crawlio_cortex("what technologies does this site use?")
--> Combines tech stack data from enrichment, browser detection, and crawl analysis
crawlio_consult
Multi-pillar consultation for complex tasks that need data from multiple sources.
Routing rules
Session-sticky browser routing
The aggregator prefers the Chrome extension when a tab is connected. If no tab is connected, browser commands go to the headless agent. This is automatic. Your AI does not choose.
| Browser connected? | Browser tools route to |
|---|---|
| Yes (tab connected) | Pillar 1 (Chrome Extension) |
| No | Pillar 2 (Headless Agent) |
Tool deduplication
Pillars 1 and 2 both have browser automation tools. The aggregator exposes overlapping tools once and routes per session state. Your AI sees browser_navigate, not pillar1_browser_navigate and pillar2_browser_navigate.
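The deduplication itself is a set union over the two pillars' tool names; the tool lists below are illustrative examples, not the real registries:

```typescript
// Illustrative deduplication: overlapping browser tools from Pillars 1
// and 2 are exposed once, under a single unprefixed name.
const pillar1Tools = ["browser_navigate", "browser_click", "trigger_capture"];
const pillar2Tools = ["browser_navigate", "browser_click", "browser_screenshot"];

// Union the two tool sets so each shared name appears exactly once.
const exposed = [...new Set([...pillar1Tools, ...pillar2Tools])];

console.log(exposed);
```

Which pillar actually services a deduplicated name is then decided at call time by session state, not by the name itself.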
Crawlio App tools are additive
Pillar 3 tools (crawl control, export, vault, intelligence, OCR) do not overlap with Pillar 1 or 2. They are always routed to Pillar 3.
Pillar resolution order
For any tool call, the aggregator resolves in this order:
1. Exact match in Pillar 3. If the tool name matches a Crawlio App tool, route there.
2. Browser tool with active session. If the tool is a browser command and a Chrome tab is connected, route to Pillar 1.
3. Browser tool without session. Route to Pillar 2 (Headless Agent).
4. Unknown tool. Return an error with available alternatives from crawlio_discover.
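The four steps can be sketched as a single resolver function. The tool tables and session flag below are illustrative stand-ins for the aggregator's internal state:

```typescript
// Sketch of the resolution order. Tool sets are illustrative examples.
const pillar3Tools = new Set(["start_crawl", "export_site", "get_enrichment"]);
const browserTools = new Set(["browser_navigate", "browser_click"]);

type Route = { pillar: 1 | 2 | 3 } | { error: string };

function resolve(tool: string, tabConnected: boolean): Route {
  if (pillar3Tools.has(tool)) return { pillar: 3 };      // 1. exact Pillar 3 match
  if (browserTools.has(tool)) {
    return tabConnected ? { pillar: 1 } : { pillar: 2 }; // 2./3. session-sticky browser
  }
  // 4. unknown tool: point back at discovery
  return { error: `Unknown tool: ${tool}. Call crawlio_discover for alternatives.` };
}

console.log(resolve("start_crawl", true));       // { pillar: 3 }
console.log(resolve("browser_navigate", false)); // { pillar: 2 }
```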
Skills reduce context further
Skills encode domain knowledge that your AI would otherwise need to figure out. Instead of your AI discovering which tools to use, composing them, and handling edge cases, a skill provides a tested workflow.
Five skills are installed by crawlio-mcp init:
| Skill | What it does |
|---|---|
| crawlio-mcp | Full tool reference with parameters and examples |
| crawl-site | Intelligent crawl workflow: start, monitor, adjust, export |
| audit-site | Multi-phase site audit: crawl, capture, enrich, analyze, report |
| observe | Query the observation log with filters |
| finding | Create evidence-backed findings from observations |
Skills work with both the aggregator and direct Pillar 3 access. They reduce what your AI needs to reason about by providing pre-built sequences.
The browser-side execution runtime
On the browser side, JIT context goes deeper. The execute tool runs JavaScript in a sandbox with framework-aware instrumentation injected at runtime.
Framework detection
Before your AI's code executes, the runtime probes the browser for framework signatures. Based on what it finds, it constructs a polymorphic smart object with the appropriate namespace methods:
| Tier | Frameworks |
|---|---|
| Core | React, Vue, Angular, Svelte |
| Meta-frameworks | Next.js, Nuxt, Remix, Gatsby |
| E-commerce | Shopify, WooCommerce |
| Content systems | WordPress, Drupal, Laravel, Django |
| Libraries | Redux, Alpine.js, jQuery |
If the target page runs React, your AI's script gains smart.react with methods to read the React devtools hook, query component trees, and extract rendered state. If the page is a Shopify storefront, smart.shopify appears with cart state and shop configuration. The runtime detects the environment and shapes the SDK to match.
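The shaping step can be sketched as conditional namespace construction. The detection labels and namespace members below are illustrative assumptions, not the runtime's real surface:

```typescript
// Hypothetical sketch: build the polymorphic smart object from detected
// framework signatures. Namespace contents are illustrative stubs.
type Namespace = Record<string, (...args: unknown[]) => unknown>;

function buildSmart(detected: string[]): Record<string, Namespace> {
  const smart: Record<string, Namespace> = {};
  if (detected.includes("react")) {
    // Stub: a real namespace would read the React devtools hook.
    smart.react = { componentTree: () => [] };
  }
  if (detected.includes("shopify")) {
    // Stub: a real namespace would expose cart state and shop config.
    smart.shopify = { cart: () => ({ items: [] }) };
  }
  return smart;
}

const smart = buildSmart(["react"]);
console.log("react" in smart, "shopify" in smart); // true false
```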
Execution context
Scripts run via execute land in a scope with:
| Variable | Description |
|---|---|
| bridge | Send CDP commands to the browser |
| crawlio | HTTP client for Crawlio App endpoints |
| smart | 7 core + 17 higher-order methods + up to 17 framework namespaces |
| sleep | Async wait (max 30s per call) |
| ocrScreenshot | macOS Vision OCR |
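A script in that scope might look like the sketch below. The stubs stand in for the injected runtime globals (bridge, smart, sleep) so the example is self-contained; real scripts receive these pre-bound, and the method names on the stubs are assumptions:

```typescript
// Illustrative shape of a script run via execute. Stubs replace the
// injected globals; their signatures are assumptions for this sketch.
const sleep = (ms: number) =>
  new Promise<void>((r) => setTimeout(r, Math.min(ms, 30_000))); // cap at 30s
const bridge = {
  send: async (method: string, params: object) => ({ method, params }),
};
const smart = {
  textContent: async (selector: string) => `<text of ${selector}>`,
};

async function script() {
  await bridge.send("Page.navigate", { url: "https://example.com" }); // CDP command
  await sleep(250);                                                   // wait for load
  return smart.textContent("h1");                                     // read the heading
}

script().then(console.log);
```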
Actionability checks
When your AI calls smart.click(selector), the runtime does not immediately dispatch a click. It polls until the element is ready: it must exist, have non-zero dimensions, be CSS-visible, be enabled, and not be obscured by overlapping elements. If the element is not ready within the timeout budget, the runtime returns a structured error explaining why.
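The poll reduces to a readiness predicate retried until it passes or the budget runs out. The predicate fields below mirror the checks listed above but are a simplified sketch, not the runtime's exact implementation:

```typescript
// Simplified actionability sketch: check readiness, retry on failure,
// and surface a structured reason when the timeout budget is exhausted.
interface ElementState {
  exists: boolean;
  width: number;
  height: number;
  visible: boolean;
  enabled: boolean;
  obscured: boolean;
}

function isActionable(el: ElementState): { ready: boolean; reason?: string } {
  if (!el.exists) return { ready: false, reason: "element not found" };
  if (el.width === 0 || el.height === 0) return { ready: false, reason: "zero-size element" };
  if (!el.visible) return { ready: false, reason: "not CSS-visible" };
  if (!el.enabled) return { ready: false, reason: "element disabled" };
  if (el.obscured) return { ready: false, reason: "obscured by another element" };
  return { ready: true };
}

async function waitForActionable(
  probe: () => ElementState,
  timeoutMs = 5000,
  intervalMs = 100,
): Promise<{ ready: boolean; reason?: string }> {
  const deadline = Date.now() + timeoutMs;
  let last = isActionable(probe());
  while (!last.ready && Date.now() < deadline) {
    await new Promise((r) => setTimeout(r, intervalMs));
    last = isActionable(probe());
  }
  return last; // on timeout, last.reason explains why the element wasn't ready
}
```

Returning the last failing reason, rather than a bare timeout, is what lets the AI adjust its next attempt instead of retrying blindly.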
Persistent browser state
The runtime maintains a persistent connection to the browser. If a script fails, the browser is still in the exact same state. Your AI reads the error, adjusts its approach, and calls execute again against the same DOM. No context is lost between cycles.
Direct access vs aggregator
You can use Crawlio MCP in two ways:
| Approach | What your AI sees | Best for |
|---|---|---|
| Aggregator | 5 meta-tools, JIT context loading | Multi-pillar workflows, browser + crawl |
| Direct Pillar 3 | 6 code-mode tools or 49 full-mode tools | Crawl-only workflows, simpler setup |
Both approaches give full access to Pillar 3 capabilities. The aggregator adds cross-pillar routing and browser access.
Next steps
- MCP Overview: the 3-pillar architecture
- Code Mode: direct Pillar 3 access with 6 tools
- Method Mode: composite tools for multi-step workflows
- Tool Reference: all 49 Crawlio App tools
- Browser Agent: Chrome automation setup