# First Crawl

## Your first download
- Launch Crawlio. Open the app. You will see an empty project window with a URL input field at the top.
- Paste a URL. Type or paste any website URL (for example, https://example.com). Crawlio adds `https://` automatically if you type a bare domain.
- Hit Start. Click the download button or press Cmd+Return.
- Watch the waterfall. The waterfall view shows every file being downloaded in real time, color-coded by content type (HTML, CSS, JS, images, fonts, media).
- Browse offline. When the crawl finishes, open the downloaded folder and browse the site locally.
Crawlio rewrites all links so the site works offline. Stylesheets, images, fonts, and internal links all point to local files.
## Default destination
Downloads save to ~/Downloads/Crawlio/ by default. Change this in Settings (Cmd+,) under the General tab, or per-project in the project settings panel.
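To jump straight to that folder from the terminal, macOS's built-in `open` command reveals it in Finder (this assumes the default location has not been changed):

```bash
# Open the default Crawlio download folder in Finder (macOS built-in command)
open ~/Downloads/Crawlio/
```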
## What gets downloaded
Crawlio follows links and downloads all linked resources:
| Content type | Examples |
|---|---|
| HTML pages | Every page reachable from your starting URL |
| Stylesheets | CSS files including @import chains and url() references |
| Images | JPEG, PNG, GIF, WebP, SVG, ICO, AVIF, BMP, TIFF |
| Fonts | WOFF, WOFF2, TTF, OTF, EOT |
| Scripts | JavaScript files referenced in HTML |
| Media | MP4, WebM, MP3, WAV, OGG, and other audio/video |
| Documents | PDFs (with link extraction), XML, JSON |
| Other | Favicons, manifests, robots.txt, sitemaps |
Twelve specialized parsers handle URL discovery across HTML, CSS, SVG, PDF, JavaScript, sitemaps, manifests, and more.
## Crawl settings
Open Settings (Cmd+,) to configure how Crawlio crawls. Key settings:
| Setting | Default | Description |
|---|---|---|
| Max Depth | 5 | How many links deep to follow |
| Concurrent Downloads | 4 | Parallel connections (1 to 40) |
| Crawl Delay | 0.5s | Pause between requests per host |
| Scope | Same Domain | Stay on domain, allow subdomains, or custom list |
| Respect robots.txt | On | Honor site crawl rules |
| Cross-Domain Assets | On | Download CSS, JS, fonts, images from external domains |
| Max File Size | 50 MB | Skip files larger than this |
| Max Total Size | 500 MB | Stop when total download reaches this limit |
For large sites, start with a lower max depth (3 to 5) to test your settings before doing a full crawl.
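To try that from the terminal, the same flags shown in Using the CLI below work for a shallow trial run before committing to a deeper crawl (the depth value here is illustrative):

```bash
# Shallow test crawl: verify scope and settings before a full-depth run
crawlio crawl start https://example.com --depth 3 --scope same-domain
```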
## Export your download
When a crawl completes, you can export the archive in any of 7 formats:
| Format | Best for |
|---|---|
| Folder | Default. Offline browsing with rewritten links |
| ZIP | Sharing compressed archives |
| Single HTML | One-file page snapshots |
| WARC | ISO 28500 web archives (Wayback Machine compatible) |
| PDF | Full-page PDF rendering via WebKit |
| Extracted | Clean text and Markdown for AI pipelines |
| Deploy | Deploy-ready static site with manifest and sitemap |
See Export Formats for details on each format.
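The WARC format in particular works with standard web-archiving tooling. As a sketch, assuming you have the third-party pywb package installed (it is not part of Crawlio, and the file name below is illustrative), you can replay an exported WARC locally:

```bash
# Replay a Crawlio-exported WARC with pywb (third-party tool, installed separately)
pip install pywb                         # one-time install
wb-manager init my-archive               # create a local replay collection
wb-manager add my-archive crawl.warc.gz  # add the exported WARC (illustrative file name)
wayback                                  # browse the archive at http://localhost:8080
```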
## Using the CLI
Start a crawl from the terminal:
```bash
crawlio crawl start https://example.com --depth 5 --scope same-domain
```

The CLI connects to the running Crawlio.app and gives you terminal control over crawls. See CLI Overview for the full command reference.
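Assuming the CLI follows common conventions, `--help` should list the remaining subcommands and flags (this is an assumption, not taken from the command reference):

```bash
# Discover available subcommands and options (assumes standard --help support)
crawlio --help
crawlio crawl --help
```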
## Next steps
- Export Formats: All 7 export formats in detail
- Connect AI: Set up MCP for AI-driven crawls
- Common Workflows: Practical recipes for common tasks
- Settings Reference: All configuration options