CrawlioCrawlio Docs

CLI Overview

The Crawlio CLI is a standalone macOS binary that controls the Crawlio desktop app from the terminal. 22 command families cover crawl control, export, analysis, browser automation, and an autonomous loop that crawls, analyzes, adjusts, and re-crawls without intervention.

Prerequisites

  • macOS 13+ (Ventura or later)
  • Crawlio.app installed (the CLI connects to the app for most operations)

Install

brew install crawlio-app/tap/crawlio

After tapping once, future upgrades are just brew upgrade crawlio.

Homebrew Cask (app + CLI)

Install the full desktop app with CLI and MCP server symlinks:

brew install --cask crawlio-app/tap/crawlio

Build from source

git clone https://github.com/Crawlio-app/crawlio.git
cd crawlio
swift build -c release
cp .build/release/CrawlioCLI /usr/local/bin/crawlio

Your first crawl

crawlio example.com

Bare domains work. Crawlio adds https:// automatically. This single command:

  1. Connects to the browser agent (if running) for framework detection
  2. Runs the autonomous crawl loop with coverage targeting
  3. Exports the result as a folder when complete

For more control, use the explicit commands:

crawlio crawl start https://example.com --watch
  Crawl started for https://example.com
 
  Status:   crawling
  URL:      https://example.com
  Discovered:  24
  Downloaded:  18
  Failed:      0
  Queued:      6
  Localized:   15

The --watch flag streams live progress until the crawl completes.

Connection model

The CLI auto-detects how to talk to Crawlio using a 4-tier resolution chain:

Priority Method What happens
1 Crawlio.app Reads port from ~/Library/Logs/Crawlio/control.port, connects via HTTP. Full read/write access to all 45 endpoints.
2 Running headless Checks ~/.crawlio/headless.port for a running crawlio-headless instance. Same HTTP client.
3 Auto-launch headless Finds crawlio-headless binary (sibling to CLI, /usr/local/bin/, ~/.crawlio/bin/, or $PATH), launches it, waits up to 10s.
4 File fallback Reads state.json and crawl.jsonl from disk. Read-only: can show status and downloads, but cannot start, stop, or modify crawls.

License tiers

Commands are gated by license tier. The CLI checks the connected app or falls back to the local Keychain.

Tier Price Commands
Free $0 run, crawl, status, downloads, logs, settings, project, pipeline, observe, finding, license, version
Core $59 + export, extract, shell, loop, debug, enrichment, data
Pro $149 + intel, capture, investigate

Insufficient tier returns an error with pricing and activation instructions.

Commands

All 22 command families:

Command Tier Description
crawlio <url> / crawlio run Free Crawl, enrich, and export in one command
crawlio crawl Free Start, stop, pause, resume, recrawl
crawlio status Free Engine state and progress counters
crawlio downloads Free List downloads, show failures, render file tree
crawlio export Core Export site (folder, zip, singleHTML, warc)
crawlio extract Core Run content extraction pipeline
crawlio settings Free View and modify crawl settings and policy
crawlio project Free List, save, load, run, delete projects
crawlio logs Free Structured log viewer with filtering
crawlio loop Core Autonomous crawl-analyze-adjust loop
crawlio pipeline Free Stdio pipeline mode (JSON in, JSONL out)
crawlio investigate Pro Deep traffic interception and analysis
crawlio intel Pro Tech stack, SEO, design, keywords, duplicates
crawlio observe Free List, get, create observations
crawlio finding Free List, create findings
crawlio capture Pro WebKit runtime capture
crawlio debug Core Metrics, log level, state dump
crawlio enrichment Core Browser enrichment data (show, submit)
crawlio data Core Structured data (JSON-LD, tables, microdata, RDFa)
crawlio license Free Show tier, activate license key
crawlio shell Core Interactive REPL
crawlio version Free Version info

Global flags

Every command supports these flags:

Flag Description
--json Output as JSON (machine-readable)
--quiet Suppress banners and non-essential output
--no-color Disable ANSI colored output
--debug Enable debug output

Output modes

Terminal (default) uses colored text, aligned tables, directory trees, and live progress bars.

JSON (--json) outputs structured JSON for piping to other tools:

crawlio status --json | jq '.progress.downloaded'

Quiet (--quiet) suppresses the ASCII banner. Only the primary result prints.

Browser agent integration

The CLI auto-connects to crawlio-browser (Chrome extension MCP server at 127.0.0.1:9333) for browser-level intelligence: framework detection, network capture, console logs, DOM snapshots, and screenshots. Use --no-agent to skip this on crawlio run.

In crawlio loop, the agent is opt-in via the --agent flag.

Next steps

© 2026 Crawlio. All rights reserved.