# Troubleshooting

## App not running (CLI or MCP can't connect)

The CLI and MCP server connect to Crawlio.app through a 4-tier fallback chain:

- Unix domain socket at `~/Library/Logs/Crawlio/control.sock`
- TCP port from `~/Library/Logs/Crawlio/control.port`
- State file at `~/Library/Logs/Crawlio/state.json` (read-only, no control)
- Headless engine at the port from `~/.crawlio/headless.port`

If all four fail, the app is not running.
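A quick way to probe the chain from the shell, as a sketch. Tiers 3 and 4 are only checked for file presence, since whether the state file and headless engine answer `/health` is an assumption:

```bash
# Probe each tier in order; report the first one that responds.
SOCK=~/Library/Logs/Crawlio/control.sock
PORTFILE=~/Library/Logs/Crawlio/control.port

if [ -S "$SOCK" ] && curl -s --max-time 2 --unix-socket "$SOCK" http://localhost/health >/dev/null; then
  echo "tier 1: control socket responding"
elif [ -f "$PORTFILE" ] && curl -s --max-time 2 "http://localhost:$(cat "$PORTFILE")/health" >/dev/null; then
  echo "tier 2: TCP control port responding"
elif [ -f ~/Library/Logs/Crawlio/state.json ]; then
  echo "tier 3: state file present (read-only, no control)"
elif [ -f ~/.crawlio/headless.port ]; then
  echo "tier 4: headless port file present"
else
  echo "app not running"
fi
```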
Fix:

- Launch Crawlio.app
- Check the socket exists: `ls -la ~/Library/Logs/Crawlio/control.sock`
- Check the port file: `cat ~/Library/Logs/Crawlio/control.port`
- If both files are missing, quit and relaunch Crawlio

## Stale port file

If the app crashed or was force-quit, the port file may still exist but point to a dead process.
Symptoms: CLI commands hang or return "Connection refused."
Fix:

```bash
# Check if the port is responding
PORT=$(cat ~/Library/Logs/Crawlio/control.port)
curl -s http://localhost:$PORT/health

# If no response, delete the stale files and relaunch
rm ~/Library/Logs/Crawlio/control.port
rm ~/Library/Logs/Crawlio/control.sock

# Then open Crawlio.app
```

## Crawl stuck
If a crawl appears frozen (progress counters stopped), check these causes:

### Circuit breaker tripped

The engine automatically backs off from hosts returning repeated errors (429, 503, connection failures). After several consecutive failures, the host is temporarily blocked.
Check: Look at the crawl log for "circuit breaker" messages:

```bash
curl --unix-socket ~/Library/Logs/Crawlio/control.sock "http://localhost/logs?level=warning&limit=20"
```

Fix: The circuit breaker resets automatically after a cooldown period. If the target server is healthy, stop and restart the crawl.

### Bandwidth throttle

If you set a bandwidth limit, downloads may appear stalled when the limit is reached.
Fix: Increase or remove the bandwidth limit in Settings.

### Per-host limits

The default per-host connection limit is 6. If all connections are waiting on a slow host, the crawl looks stuck.
Fix: Increase per-host connections, or add a crawl delay to spread requests over time.
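If the per-host limit is scriptable, it presumably follows the same `crawlio settings set` pattern used later in this guide; the key name `settings.maxPerHost` is an assumption, so confirm it in Settings first:

```bash
# Raise per-host connections from the default of 6 (key name is an assumption)
crawlio settings set settings.maxPerHost 12
```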

### Crawl delay

A high crawl delay (e.g., 5 seconds) with low concurrency makes the crawl very slow by design.
Fix: Reduce `crawlDelay` in settings. Values below 1.0 seconds are typical.

## MCP not connecting

### Check installation

```bash
# Verify the binary exists
which crawlio-mcp

# Test basic startup
crawlio-mcp --help
```

### Check config file

Run `crawlio-mcp init` to write the correct config for your AI client. The command auto-detects installed clients (Claude Code, Claude Desktop, VS Code, Cursor, Windsurf).

```bash
crawlio-mcp init
```

### Verify the config
Check that the config file points to the correct binary path:
| Client | Config path |
|---|---|
| Claude Code | ~/.claude.json or .mcp.json |
| Claude Desktop | ~/Library/Application Support/Claude/claude_desktop_config.json |
| VS Code | ~/Library/Application Support/Code/User/mcp.json |
| Cursor | ~/.cursor/mcp.json |
| Windsurf | ~/.codeium/windsurf/mcp_config.json |
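If you edit a config by hand instead of running `crawlio-mcp init`, the entry follows the standard MCP `mcpServers` shape; the server name and binary path below are assumptions (use the output of `which crawlio-mcp`):

```json
{
  "mcpServers": {
    "crawlio": {
      "command": "/opt/homebrew/bin/crawlio-mcp"
    }
  }
}
```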

### Test the connection

```bash
# Check if Crawlio.app is running and the API is accessible
curl --unix-socket ~/Library/Logs/Crawlio/control.sock http://localhost/health
```

If the health check fails, the MCP server cannot reach the app. Launch Crawlio.app first.

## Browser agent not connecting

The Chrome extension communicates with Crawlio through native messaging and a WebSocket bridge.

### Bridge file missing

The native messaging manifest must exist at the correct path for your browser.
Fix:

```bash
# Install native messaging for all detected browsers
crawlio-mcp init
```

Or install manually through Settings > Advanced > Browser Extension Integration.

### WebSocket port conflict

If another process is using the bridge port, the connection fails silently.
Fix: Check for port conflicts in `~/Library/Logs/Crawlio/crawl.jsonl`. Restart the app to rebind.
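As a sketch for narrowing this down (the log message wording and the bridge port number are assumptions; substitute the port reported in your log):

```bash
# Look for bridge/port entries in the structured log
grep -i "port" ~/Library/Logs/Crawlio/crawl.jsonl | tail -5

# See which process holds a given port (8765 is a placeholder)
lsof -nP -iTCP:8765 -sTCP:LISTEN
```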

### Extension not installed

Make sure the Crawlio for Chrome extension is installed and enabled in your browser.

## Export fails

### Disk space

Large sites can produce multi-gigabyte exports, especially WARC format with gzip disabled.
Fix: Check available disk space before exporting. Use WARC with compression enabled for the most compact output.
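To check free space on the volume backing your export destination:

```bash
df -h ~/Downloads
```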

### Permissions

The export destination directory must be writable.
Fix: Choose a destination in your home directory (e.g., ~/Downloads/ or ~/Desktop/). Avoid system directories.

### Broken links in offline folder

If links are broken in the exported folder, the link localizer did not recognize a URL pattern.
Common causes:

- JavaScript-generated links. These are not in the HTML source, so they cannot be rewritten. Enable WebKit mode.
- Fragment-only links (e.g., `#section`). These should work without rewriting.
- Cross-domain assets on nested pages. Deep pages with CDN references may have incorrect relative paths.

### WARC validation

```bash
# Check the file is not empty
wc -c export.warc.gz

# View the first few WARC records (gunzip -c is used because
# macOS zcat expects a .Z suffix)
gunzip -c export.warc.gz | head -50
```

WARC files conform to ISO 28500 and include SHA-1 digests, a CDX index, and deduplication. Load the file into ReplayWeb.page to verify playback.
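For a slightly deeper sanity check, count the response records; the header name follows the WARC spec, and the expected count depends on the crawl:

```bash
gunzip -c export.warc.gz | grep -c "^WARC-Type: response"
```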
"Pro required" error
Some features require a Pro license. If you see a tier-gating error, you are on the Free or Core tier.

### What requires Pro

- Intelligence endpoints: `/tech-stack`, `/seo-findings`, `/design-intel`, `/keyword-intel`, `/duplicate-content`
- The intelligence tier in the MCP tools
- Advanced analysis features in the app

### How to check your tier

```bash
curl --unix-socket ~/Library/Logs/Crawlio/control.sock http://localhost/license
```

### How to activate
- Purchase a license at crawlio.app/buy/pro
- Open Settings > License in the app
- Enter your license key
- The app validates and activates immediately

### Free tier limits

Free tier allows 5 crawls per week; the limit resets weekly. The `/start` endpoint returns 429 with a `resetsAt` timestamp when the limit is reached.
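What hitting the limit looks like, as a sketch. Only the 429 status and the `resetsAt` field are documented here; the request body and any other response fields are assumptions:

```bash
# Attempt a crawl start while over quota; -i prints the status line and headers
curl -s -i --unix-socket ~/Library/Logs/Crawlio/control.sock \
  -X POST http://localhost/start \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
# Expect an HTTP/1.1 429 status with a JSON body containing resetsAt
```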

## Slow crawl

Crawlio defaults to conservative concurrency to avoid overwhelming servers.

### Speed up

```bash
# Increase parallel connections (default: 10, max: 40)
crawlio settings set settings.maxConcurrent 20

# Reduce per-host delay
crawlio settings set settings.crawlDelay 0.0
```

### Slow down (avoid rate limiting)

```bash
# Add delay between requests
crawlio settings set settings.crawlDelay 1.0

# Reduce concurrency
crawlio settings set settings.maxConcurrent 3
```

## High memory on large sites
For sites with 10,000+ pages:
- Set max pages or max depth to cap the crawl
- Disable WebKit mode if not needed (rendering uses more memory)
- Close other apps to free memory
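To script the caps from the first item above, something like the following may work; the key names `settings.maxPages` and `settings.maxDepth` are assumptions, so verify them in Settings first:

```bash
# Cap crawl size (key names are assumptions)
crawlio settings set settings.maxPages 5000
crawlio settings set settings.maxDepth 4
```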

## SSL errors

### Certificate validation

Crawlio validates SSL certificates by default. Self-signed or expired certificates cause download failures.
Fix for sites you own: Disable strict certificate validation in Settings > Advanced. Only do this for trusted sites.
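To see why validation fails, inspect the certificate's dates and issuer directly (replace `example.com` with your host):

```bash
echo | openssl s_client -connect example.com:443 -servername example.com 2>/dev/null \
  | openssl x509 -noout -dates -issuer -subject
```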

### Certificate pinning

If you configured certificate pinning in settings, only connections matching the pinned public key are allowed.
Fix: Remove or update the pinned key if the server certificate has changed.
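If the pin format is an SPKI SHA-256 hash (the common convention; the exact format Crawlio expects is an assumption), you can compute the server's current value to compare against your settings:

```bash
# Base64-encoded SHA-256 of the server's public key (SPKI)
echo | openssl s_client -connect example.com:443 -servername example.com 2>/dev/null \
  | openssl x509 -pubkey -noout \
  | openssl pkey -pubin -outform der \
  | openssl dgst -sha256 -binary | base64
```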

### HTTP-to-HTTPS upgrade

Crawlio automatically upgrades HTTP URLs to HTTPS when HSTS headers are present. If a site has broken HTTPS, this can cause failures.
Fix: Disable HSTS enforcement in settings if the site does not support HTTPS properly.
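To confirm whether the site actually sends an HSTS header:

```bash
curl -sI https://example.com/ | grep -i strict-transport-security
```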

## Site downloads but pages are blank

This means the site is a JavaScript-rendered SPA (React, Next.js, Vue, etc.). Crawlio downloads the HTML, but the content is rendered by JavaScript at runtime.
Fix: Enable WebKit mode in Settings. This renders JavaScript before saving the page.
Check the framework detection field in the status response:
```bash
curl --unix-socket ~/Library/Logs/Crawlio/control.sock http://localhost/status
```

If it shows `react`, `nextjs`, `vue`, or similar, WebKit mode will help.
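A minimal sketch for pulling just that field, assuming the status JSON exposes it under a `framework` key (the key name is an assumption):

```bash
curl -s --unix-socket ~/Library/Logs/Crawlio/control.sock http://localhost/status \
  | python3 -c 'import json, sys; print(json.load(sys.stdin).get("framework"))'
```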

## App won't open (Gatekeeper)

macOS may block Crawlio because it is distributed outside the App Store.
- Open System Settings > Privacy & Security
- Scroll down to find "Crawlio was blocked"
- Click Open Anyway
- You only need to do this once

## CLI not found after Homebrew install

If `crawlio` is not found after `brew install crawlio-app/tap/crawlio`:
```bash
# Check installation
brew list crawlio

# Make sure Homebrew bin is in your PATH (Apple Silicon)
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zshrc
source ~/.zshrc
```

## Getting help
If your issue is not covered here:
- Check the HTTP API for programmatic debugging
- Check File Locations to find log and state files
- Report a bug
- Request a feature
- Join Discord for real-time support

## Next steps

- See Architecture to understand the crawl pipeline
- See Keyboard Shortcuts for fast navigation
- See MCP Tools for the full tool reference