Crawlio Docs

First Crawl

Your first download

  1. Launch Crawlio. Open the app. You will see an empty project window with a URL input field at the top.
  2. Paste a URL. Type or paste any website URL (for example, https://example.com). Crawlio adds https:// automatically if you type a bare domain.
  3. Hit Start. Click the download button or press Cmd+Return.
  4. Watch the waterfall. The waterfall view shows every file being downloaded in real time, color-coded by content type (HTML, CSS, JS, images, fonts, media).
  5. Browse offline. When the crawl finishes, open the downloaded folder and browse the site locally.
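
Step 2 mentions that a bare domain gets https:// added automatically. As a rough illustration of that kind of normalization (not Crawlio's actual code; the function name `normalize_start_url` is hypothetical), a minimal Python sketch might look like this:

```python
from urllib.parse import urlsplit


def normalize_start_url(raw: str) -> str:
    """Prepend https:// when the input has no scheme (hypothetical helper)."""
    text = raw.strip()
    if "://" not in text:
        # Assumption: bare domains default to HTTPS, as the step above describes.
        text = "https://" + text
    if not urlsplit(text).netloc:
        raise ValueError(f"not a usable URL: {raw!r}")
    return text


if __name__ == "__main__":
    print(normalize_start_url("example.com"))
    print(normalize_start_url("http://example.com/docs"))
```

URLs that already carry a scheme pass through unchanged, so an explicit http:// address is not silently upgraded in this sketch.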

Crawlio rewrites all links so the site works offline. Stylesheets, images, fonts, and internal links all point to local files.
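
To make the idea of link rewriting concrete, here is a small Python sketch under stated assumptions. The folder layout (one directory per host, with `index.html` for directory URLs) and the helper names `local_path` and `rewrite_link` are invented for illustration and may not match how Crawlio lays out files. Query strings and fragments are ignored for brevity.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def local_path(url: str) -> str:
    """Map an absolute URL to a path inside the download folder (assumed layout)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        # Assumption: directory URLs are saved as index.html.
        path += "index.html"
    return parts.netloc + path


def rewrite_link(page_url: str, href: str) -> str:
    """Return a relative link from the saved page to the saved target."""
    target = local_path(urljoin(page_url, href))
    page_dir = posixpath.dirname(local_path(page_url))
    return posixpath.relpath(target, start=page_dir)


if __name__ == "__main__":
    print(rewrite_link("https://example.com/blog/post.html", "/css/site.css"))
```

Because every rewritten link is relative, the downloaded folder can be moved or zipped without breaking navigation.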

Default destination

Downloads save to ~/Downloads/Crawlio/ by default. Change this in Settings (Cmd+,) under the General tab, or per-project in the project settings panel.

What gets downloaded

Crawlio follows links and downloads all linked resources:

Content type   Examples
HTML pages     Every page reachable from your starting URL
Stylesheets    CSS files including @import chains and url() references
Images         JPEG, PNG, GIF, WebP, SVG, ICO, AVIF, BMP, TIFF
Fonts          WOFF, WOFF2, TTF, OTF, EOT
Scripts        JavaScript files referenced in HTML
Media          MP4, WebM, MP3, WAV, OGG, and other audio/video
Documents      PDFs (with link extraction), XML, JSON
Other          Favicons, manifests, robots.txt, sitemaps

12 specialized parsers handle URL discovery across HTML, CSS, SVG, PDF, JavaScript, sitemaps, manifests, and more.
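
As an example of what URL discovery in a stylesheet involves, the sketch below pulls @import and url() references out of CSS text and resolves them against the stylesheet's own URL. It is a simplified regex-based illustration, not one of Crawlio's parsers; the function name `discover_css_urls` is hypothetical, and a production parser would also need to handle comments and escaped characters.

```python
import re
from urllib.parse import urljoin

# url(...) with optional matching quotes around the reference.
URL_FUNC = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)
# @import "file.css" written with a plain quoted string.
IMPORT_STR = re.compile(r"""@import\s+(['"])(.*?)\1""", re.IGNORECASE)


def discover_css_urls(css: str, base_url: str) -> list[str]:
    """Return absolute URLs referenced by a stylesheet, without duplicates."""
    found: list[str] = []
    for pattern in (IMPORT_STR, URL_FUNC):
        for match in pattern.finditer(css):
            ref = match.group(2).strip()
            if not ref or ref.startswith("data:"):
                continue  # inline data URIs need no download
            absolute = urljoin(base_url, ref)
            if absolute not in found:
                found.append(absolute)
    return found


if __name__ == "__main__":
    sample = '@import "base.css"; body { background: url(../img/bg.png); }'
    for url in discover_css_urls(sample, "https://example.com/css/site.css"):
        print(url)
```

Relative references resolve against the stylesheet URL rather than the page URL, which is why the base URL is passed in explicitly.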

Crawl settings

Open Settings (Cmd+,) to configure how Crawlio crawls. Key settings:

Setting               Default      Description
Max Depth             5            How many links deep to follow
Concurrent Downloads  4            Parallel connections (1 to 40)
Crawl Delay           0.5 s        Pause between requests per host
Scope                 Same Domain  Stay on domain, allow subdomains, or custom list
Respect robots.txt    On           Honor site crawl rules
Cross-Domain Assets   On           Download CSS, JS, fonts, images from external domains
Max File Size         50 MB        Skip files larger than this
Max Total Size        500 MB       Stop when total download reaches this limit
Tip: For large sites, start with a lower max depth (3 to 5) to test your settings before doing a full crawl.
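
To show how depth, scope, and size limits could interact, here is a Python sketch that models the documented defaults and decides whether a page link should be followed. Everything beyond the default values is an assumption made for illustration: the names `CrawlSettings` and `should_follow_page`, treating the start page as depth 0, reading "Same Domain" as an exact host match, and counting 1 MB as 1024 * 1024 bytes.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

MB = 1024 * 1024  # assumption: binary megabytes


@dataclass(frozen=True)
class CrawlSettings:
    """Hypothetical container mirroring the defaults in the table above."""
    max_depth: int = 5
    concurrent_downloads: int = 4
    crawl_delay_s: float = 0.5
    respect_robots: bool = True
    max_file_size: int = 50 * MB
    max_total_size: int = 500 * MB


def should_follow_page(url: str, depth: int, start_url: str,
                       bytes_so_far: int, settings: CrawlSettings) -> bool:
    """Decide whether to follow a page link (sketch, same-domain scope only)."""
    if depth > settings.max_depth:
        return False  # too many links away from the start page
    if bytes_so_far >= settings.max_total_size:
        return False  # total size budget is used up
    # Assumption: "Same Domain" means the host must match exactly.
    return urlsplit(url).hostname == urlsplit(start_url).hostname


if __name__ == "__main__":
    trial = CrawlSettings(max_depth=3)  # a shallow test crawl, as the tip suggests
    print(should_follow_page("https://example.com/a", 4, "https://example.com", 0, trial))
```

Lowering `max_depth` for a trial run, as in the usage lines, keeps a first crawl of a large site small while you check the other settings.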

Export your download

When a crawl completes, export your archive in 7 formats:

Format       Best for
Folder       Default. Offline browsing with rewritten links
ZIP          Sharing compressed archives
Single HTML  One-file page snapshots
WARC         ISO 28500 web archives (Wayback Machine compatible)
PDF          Full-page PDF rendering via WebKit
Extracted    Clean text and Markdown for AI pipelines
Deploy       Deploy-ready static site with manifest and sitemap

See Export Formats for details on each format.

Using the CLI

Start a crawl from the terminal:

crawlio crawl start https://example.com --depth 5 --scope same-domain

The CLI connects to the running Crawlio.app and gives you terminal control over crawls. See CLI Overview for the full command reference.
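
If you want to trigger crawls from a script, one option is to wrap the command shown above. The Python sketch below uses only the subcommand and flags that appear on this page (`crawl start`, `--depth`, `--scope`); the function names are hypothetical, and the meaning of the CLI's exit code is not documented here, so the code simply returns it.

```python
import shutil
import subprocess


def build_crawl_command(url: str, depth: int = 5,
                        scope: str = "same-domain") -> list[str]:
    """Build the argument list for the documented `crawlio crawl start` call."""
    return ["crawlio", "crawl", "start", url,
            "--depth", str(depth), "--scope", scope]


def start_crawl(url: str, depth: int = 5, scope: str = "same-domain") -> int:
    """Run the CLI and return its exit code (Crawlio.app must be running)."""
    if shutil.which("crawlio") is None:
        raise FileNotFoundError("crawlio CLI not found on PATH")
    result = subprocess.run(build_crawl_command(url, depth, scope), check=False)
    return result.returncode


if __name__ == "__main__":
    print(" ".join(build_crawl_command("https://example.com")))
```

Passing the arguments as a list avoids shell quoting problems when a URL contains characters such as `&` or `?`.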

© 2026 Crawlio. All rights reserved.