Crawlio Docs

What is Crawlio?

Overview

Crawlio is a macOS website downloader built with Swift 6 and SwiftUI. It crawls, downloads, and packages entire websites for offline browsing, analysis, or deployment. Every linked resource (HTML pages, stylesheets, scripts, images, videos, fonts, PDFs) is downloaded and all links are rewritten to work locally.

Crawlio uses Swift structured concurrency for all concurrent operations and respects robots.txt by default. Beyond simple mirroring, it ships with AI integration: framework detection, browser runtime capture, Vision OCR, and an MCP server.
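To make "respects robots.txt" concrete, here is a minimal sketch of the check a polite crawler performs before fetching a URL. It uses Python's standard `urllib.robotparser` and is not Crawlio's Swift implementation. The robots.txt body and the `ExampleCrawler` user agent are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real crawler would fetch it from the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def build_robots_checker(robots_body: str, user_agent: str):
    """Return a function that reports whether a URL may be fetched."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return lambda url: parser.can_fetch(user_agent, url)


# "ExampleCrawler" is a placeholder user agent, not Crawlio's actual one.
may_fetch = build_robots_checker(ROBOTS_TXT, "ExampleCrawler")

urls = [
    "https://example.com/index.html",
    "https://example.com/private/report.pdf",
]

# Keep only the URLs that the rules permit.
allowed = [u for u in urls if may_fetch(u)]
print(allowed)
```

Only the first URL survives the filter, because the second falls under the `Disallow: /private/` rule.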

Key capabilities

Full-site download. Crawl an entire website with configurable depth (up to 100 levels), concurrency (1 to 40 parallel connections), and scope rules. 12 specialized parsers extract URLs from HTML, CSS, SVG, PDF, JavaScript, sitemaps, manifests, and more. All links are rewritten for offline browsing.
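Link rewriting can be pictured as two steps: map each URL to a file inside the mirror folder, then replace each link with the relative path between the two files. The sketch below is an illustration in Python, not Crawlio's actual algorithm. The folder layout (`host/path`, with `index.html` for directory URLs) and the function names are assumptions, and query strings and fragments are ignored for brevity.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def local_path(url: str) -> str:
    """Map an absolute URL to a path inside the mirror folder (assumed layout)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        path += "index.html"  # directory-style URLs get an index file
    return parts.netloc + path


def rewrite_link(page_url: str, href: str) -> str:
    """Rewrite an href found on page_url into a relative path between local files."""
    target = urljoin(page_url, href)  # resolve relative hrefs against the page
    start = posixpath.dirname(local_path(page_url))
    return posixpath.relpath(local_path(target), start=start)


print(rewrite_link("https://example.com/blog/post.html", "/css/site.css"))
# prints ../css/site.css
print(rewrite_link("https://example.com/", "about/"))
# prints about/index.html
```

Because the rewritten links are relative, the mirror folder can be moved or zipped without breaking navigation.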

19 site analyzers. Coverage includes SEO, accessibility, security headers, best practices, content quality, design system, hreflang, images, keywords, link intelligence, orphan pages, redirect chains, social meta, tracking surfaces, URL hygiene, parity (raw vs. rendered), and duplicate content detection.

7 export formats. Folder (offline browsing), ZIP (sharing), Single HTML (one-file snapshots), WARC (ISO 28500 web archives with SHA-1 digests, CDX indexing, deduplication, per-record gzip, and file splitting), PDF (WebKit rendering), Extracted (clean text and Markdown for AI), and Deploy (static site with manifest and sitemap).
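The WARC digests and deduplication mentioned above are related: two records with the same payload digest carry identical content, so the second can be treated as a duplicate. The Python sketch below shows the idea under stated assumptions. It uses the `sha1:<base32>` labelled form that WARC files commonly carry; whether Crawlio encodes digests in Base32 or hex is not stated here, and the function names are hypothetical.

```python
import base64
import hashlib


def payload_digest(payload: bytes) -> str:
    """Return a labelled SHA-1 digest in the 'sha1:<base32>' form."""
    raw = hashlib.sha1(payload).digest()
    return "sha1:" + base64.b32encode(raw).decode("ascii")


def find_duplicates(records: dict[str, bytes]) -> dict[str, str]:
    """Map each URL with an already-seen payload to the first URL that had it."""
    first_seen: dict[str, str] = {}
    duplicates: dict[str, str] = {}
    for url, payload in records.items():
        digest = payload_digest(payload)
        if digest in first_seen:
            duplicates[url] = first_seen[digest]
        else:
            first_seen[digest] = url
    return duplicates


# Made-up sample data: two URLs serve byte-identical content.
records = {
    "https://example.com/logo.png": b"\x89PNG-bytes",
    "https://example.com/about.html": b"<html>About</html>",
    "https://example.com/img/logo.png": b"\x89PNG-bytes",
}
print(find_duplicates(records))
```

In this sample, the third URL is reported as a duplicate of the first.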

AI enrichment. Framework detection (59 technologies across 4 detection layers), browser runtime capture (network requests, console logs, DOM snapshots, screenshots), and Vision OCR on downloaded images. All enrichment data is stored per-URL and flows into exports.

MCP server. 49 full-mode tools and 6 code-mode tools for crawl control, monitoring, export, and enrichment. Connect to Claude Code, Claude Desktop, ChatGPT, Cursor, Windsurf, VS Code, and 10+ other clients via the init wizard.

Browser agent. Chrome extension with 96 CDP-powered tools exposed via 3 code-mode tools. Captures runtime signals that static parsing cannot detect: JS framework hydration, network waterfalls, console errors, live DOM.

CLI. 22 command families (~45 leaf commands), including an interactive REPL shell and an autonomous crawl loop with AI-guided strategy. The CLI connects to the running app, connects to a headless engine, or reads files directly, choosing among them through a 4-tier resolution chain.

HTTP API. 45 REST endpoints on a local Unix Domain Socket for programmatic control from any language or tool.
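Because the API listens on a Unix Domain Socket rather than a TCP port, a client has to speak HTTP over that socket. The following Python sketch shows one way to do this with the standard library alone. The socket path and the `/status` endpoint in the usage comment are placeholders, not documented Crawlio values; consult the HTTP API reference for the real ones.

```python
import http.client
import socket


class UnixSocketHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a Unix Domain Socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 10.0):
        # The host name is only used to fill in the Host header.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def get(socket_path: str, endpoint: str) -> tuple[int, bytes]:
    """Send a GET request over the socket and return (status, body)."""
    conn = UnixSocketHTTPConnection(socket_path)
    try:
        conn.request("GET", endpoint)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


# Usage (both arguments are placeholders, not documented values):
# status, body = get("/path/to/crawlio.sock", "/status")
```

Command-line tools can take the same route; for example, curl accepts a `--unix-socket` option.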

Platform. macOS 13+ (Ventura and later). iOS 17+ for the mobile companion app.

How to use these docs

Section           What you'll find
Getting Started   Installation, first crawl, AI setup, changelog
Guides            Export formats, common workflows, settings, framework detection, AI enrichment
MCP Server        Setup, code mode, method mode, evidence mode, JIT context, tool reference
Browser Agent     Chrome extension setup, MCP server install, browser tools reference
AI Skills         Bundled skills that teach AI assistants how to crawl and analyze sites
CLI               Command reference, interactive shell, autonomous loop
Reference         HTTP API, architecture, file locations, troubleshooting, keyboard shortcuts

© 2026 Crawlio. All rights reserved.