CLI Overview
The Crawlio CLI is a standalone macOS binary that controls the Crawlio desktop app from the terminal. Its 22 command families cover crawl control, export, analysis, browser automation, and an autonomous loop that crawls, analyzes, adjusts, and re-crawls without intervention.
Prerequisites
- macOS 13+ (Ventura or later)
- Crawlio.app installed (the CLI connects to the app for most operations)
Install
Homebrew (recommended)
brew install crawlio-app/tap/crawlio
After tapping once, future upgrades are just brew upgrade crawlio.
Homebrew Cask (app + CLI)
Install the full desktop app with CLI and MCP server symlinks:
brew install --cask crawlio-app/tap/crawlio
Build from source
git clone https://github.com/Crawlio-app/crawlio.git
cd crawlio
swift build -c release
cp .build/release/CrawlioCLI /usr/local/bin/crawlio
Your first crawl
crawlio example.com
Bare domains work: Crawlio adds https:// automatically. This single command:
- Connects to the browser agent (if running) for framework detection
- Runs the autonomous crawl loop with coverage targeting
- Exports the result as a folder when complete
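If a wrapper script needs the same bare-domain convenience before handing URLs to other tools, the behavior can be approximated as below. This is a minimal sketch under one assumed rule (prepend https:// when the input has no scheme); normalize_url is a hypothetical helper, not part of the Crawlio CLI, and the CLI's own normalization may differ.

```shell
# Hypothetical helper: mimic "bare domains get https://".
# Assumption: any input without a scheme:// prefix is a bare domain.
normalize_url() {
  case "$1" in
    *://*) printf '%s\n' "$1" ;;          # already has a scheme, leave untouched
    *)     printf 'https://%s\n' "$1" ;;  # bare domain, default to HTTPS
  esac
}
```

For instance, normalize_url example.com prints https://example.com, while an input that already carries a scheme passes through unchanged.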
For more control, use the explicit commands:
crawlio crawl start https://example.com --watch
Crawl started for https://example.com
Status: crawling
URL: https://example.com
Discovered: 24
Downloaded: 18
Failed: 0
Queued: 6
Localized: 15
The --watch flag streams live progress until the crawl completes.
Connection model
The CLI auto-detects how to talk to Crawlio using a 4-tier resolution chain:
| Priority | Method | What happens |
|---|---|---|
| 1 | Crawlio.app | Reads port from ~/Library/Logs/Crawlio/control.port, connects via HTTP. Full read/write access to all 45 endpoints. |
| 2 | Running headless | Checks ~/.crawlio/headless.port for a running crawlio-headless instance. Same HTTP client. |
| 3 | Auto-launch headless | Finds crawlio-headless binary (sibling to CLI, /usr/local/bin/, ~/.crawlio/bin/, or $PATH), launches it, waits up to 10s. |
| 4 | File fallback | Reads state.json and crawl.jsonl from disk. Read-only: can show status and downloads, but cannot start, stop, or modify crawls. |
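The lookup order in the table can be sketched as a shell function that reports which tier would be chosen. This is an illustration only, not the CLI's implementation: resolve_crawlio_tier is a hypothetical name, the port files are assumed to contain a bare port number, the "sibling to CLI" location is omitted, and nothing is actually connected to or launched.

```shell
# Sketch of the 4-tier lookup order; prints the tier that would be used.
# Assumptions: port files hold a bare port number; the function only
# reports a decision and never connects or launches anything.
resolve_crawlio_tier() {
  home="${1:-$HOME}"   # optional override of the home directory
  app_port_file="$home/Library/Logs/Crawlio/control.port"
  headless_port_file="$home/.crawlio/headless.port"

  # Tier 1: the desktop app has published its control port.
  if [ -s "$app_port_file" ]; then
    printf 'app:%s\n' "$(cat "$app_port_file")"
    return 0
  fi

  # Tier 2: a headless instance is already running.
  if [ -s "$headless_port_file" ]; then
    printf 'headless:%s\n' "$(cat "$headless_port_file")"
    return 0
  fi

  # Tier 3: a crawlio-headless binary exists and could be launched.
  for candidate in /usr/local/bin/crawlio-headless "$home/.crawlio/bin/crawlio-headless"; do
    if [ -x "$candidate" ]; then
      printf 'launch:%s\n' "$candidate"
      return 0
    fi
  done
  if command -v crawlio-headless >/dev/null 2>&1; then
    printf 'launch:%s\n' "$(command -v crawlio-headless)"
    return 0
  fi

  # Tier 4: read-only fallback to on-disk state.
  printf 'file\n'
}
```

Because earlier tiers return as soon as they match, a running Crawlio.app always wins over a headless instance, which matches the priority column above.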
License tiers
Commands are gated by license tier. The CLI checks the connected app or falls back to the local Keychain.
| Tier | Price | Commands |
|---|---|---|
| Free | $0 | run, crawl, status, downloads, logs, settings, project, pipeline, observe, finding, license, version |
| Core | $59 | + export, extract, shell, loop, debug, enrichment, data |
| Pro | $149 | + intel, capture, investigate |
Running a command that requires a higher tier than the active license returns an error with pricing and activation instructions.
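A script that wants to check a command against a tier before invoking it could encode the table as follows. This is a sketch, not how the CLI performs its check: the function names required_tier, tier_rank, and tier_allows are hypothetical, and it assumes tiers are cumulative, as the "+" in the table suggests.

```shell
# Required tier per command family, copied from the tier table.
required_tier() {
  case "$1" in
    run|crawl|status|downloads|logs|settings|project|pipeline|observe|finding|license|version)
      echo free ;;
    export|extract|shell|loop|debug|enrichment|data)
      echo core ;;
    intel|capture|investigate)
      echo pro ;;
    *)
      return 1 ;;  # unknown command family
  esac
}

# Numeric rank so tiers can be compared (assumes tiers are cumulative).
tier_rank() {
  case "$1" in
    free) echo 0 ;;
    core) echo 1 ;;
    pro)  echo 2 ;;
    *)    return 1 ;;
  esac
}

# Succeeds if a license at tier $1 may run command family $2.
tier_allows() {
  have="$(tier_rank "$1")" || return 1
  need_tier="$(required_tier "$2")" || return 1
  need="$(tier_rank "$need_tier")" || return 1
  [ "$have" -ge "$need" ]
}
```

With these definitions, tier_allows core export succeeds and tier_allows free export fails, so a script can skip gated commands instead of parsing the CLI's error output.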
Commands
All 22 command families:
| Command | Tier | Description |
|---|---|---|
| crawlio <url> / crawlio run | Free | Crawl, enrich, and export in one command |
| crawlio crawl | Free | Start, stop, pause, resume, recrawl |
| crawlio status | Free | Engine state and progress counters |
| crawlio downloads | Free | List downloads, show failures, render file tree |
| crawlio export | Core | Export site (folder, zip, singleHTML, warc) |
| crawlio extract | Core | Run content extraction pipeline |
| crawlio settings | Free | View and modify crawl settings and policy |
| crawlio project | Free | List, save, load, run, delete projects |
| crawlio logs | Free | Structured log viewer with filtering |
| crawlio loop | Core | Autonomous crawl-analyze-adjust loop |
| crawlio pipeline | Free | Stdio pipeline mode (JSON in, JSONL out) |
| crawlio investigate | Pro | Deep traffic interception and analysis |
| crawlio intel | Pro | Tech stack, SEO, design, keywords, duplicates |
| crawlio observe | Free | List, get, create observations |
| crawlio finding | Free | List, create findings |
| crawlio capture | Pro | WebKit runtime capture |
| crawlio debug | Core | Metrics, log level, state dump |
| crawlio enrichment | Core | Browser enrichment data (show, submit) |
| crawlio data | Core | Structured data (JSON-LD, tables, microdata, RDFa) |
| crawlio license | Free | Show tier, activate license key |
| crawlio shell | Core | Interactive REPL |
| crawlio version | Free | Version info |
Global flags
Every command supports these flags:
| Flag | Description |
|---|---|
| --json | Output as JSON (machine-readable) |
| --quiet | Suppress banners and non-essential output |
| --no-color | Disable ANSI colored output |
| --debug | Enable debug output |
Output modes
Terminal (default) uses colored text, aligned tables, directory trees, and live progress bars.
JSON (--json) outputs structured JSON for piping to other tools:
crawlio status --json | jq '.progress.downloaded'
Quiet (--quiet) suppresses the ASCII banner. Only the primary result prints.
Browser agent integration
The CLI auto-connects to crawlio-browser (Chrome extension MCP server at 127.0.0.1:9333) for browser-level intelligence: framework detection, network capture, console logs, DOM snapshots, and screenshots. Use --no-agent to skip this on crawlio run.
In crawlio loop, the agent is opt-in via the --agent flag.
Next steps
- See all commands and flags in the CLI Commands Reference
- Learn about the Interactive Shell
- Learn about the Autonomous Loop
- Connect AI tools with MCP integration