docs/codebase/cli.md
cli.py command surface

| Command | Inputs / flags | What it does |
|---|---|---|
| `profiles` | (none) | Opens the interactive profile manager; lets you list, create, and delete saved browser profiles that live in `~/.crawl4ai/profiles`. |
| `browser status` | (none) | Prints whether the always-on builtin browser is running, along with its CDP URL, PID, and start time. |
| `browser stop` | (none) | Kills the builtin browser and deletes its status file. |
| `browser view` | `--url, -u URL` (optional) | Pops up a visible window of the builtin browser and navigates to URL or `about:blank`. |
| `config list` | (none) | Dumps every global setting, showing current value, default, and description. |
| `config get` | `key` | Prints the value of a single setting, falling back to the default if unset. |
| `config set` | `key value` | Persists a new value in the global config (stored under `~/.crawl4ai/config.yml`). |
| `examples` | (none) | Prints real-world CLI usage samples. |
| `crawl` | `url` (positional)<br>`--browser-config, -B path`<br>`--crawler-config, -C path`<br>`--filter-config, -f path`<br>`--extraction-config, -e path`<br>`--json-extract, -j [desc]`*<br>`--schema, -s path`<br>`--browser, -b k=v list`<br>`--crawler, -c k=v list`<br>`--output, -o all,json,markdown,md,markdown-fit,md-fit` (default `all`)<br>`--output-file, -O path`<br>`--bypass-cache, -b` (flag, default true; note `-b` collides with `--browser`)<br>`--question, -q str`<br>`--verbose, -v` (flag)<br>`--profile, -p profile-name` | One-shot crawl + extraction. Builds `BrowserConfig` and `CrawlerRunConfig` from inline flags or separate YAML/JSON files, runs `AsyncWebCrawler.run()`, can route through a named saved profile, and pipes the result to stdout or a file. |
| (default) | Same flags as `crawl`, plus `--example` | Shortcut so you can type just `crwl https://site.com`. When the first arg is not a known sub-command, it falls through to `crawl`. |

\* `--json-extract`/`-j` with no value turns on LLM-based JSON extraction using an auto-generated schema; supplying a string lets you prompt-engineer the field descriptions.
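The `--browser, -b` and `--crawler, -c` flags accept comma-separated `k=v` lists that are turned into config keyword arguments. A minimal sketch of how such a list might be parsed and coerced (the helper name and coercion rules here are illustrative assumptions, not the actual cli.py implementation):

```python
def parse_kv_list(spec: str) -> dict:
    """Parse a CLI string like "headless=true,viewport_width=1680"
    into a dict, coercing booleans and integers (illustrative sketch)."""
    out = {}
    for pair in spec.split(","):
        key, _, raw = pair.partition("=")
        key, raw = key.strip(), raw.strip()
        if raw.lower() in ("true", "false"):   # coerce booleans
            out[key] = raw.lower() == "true"
        else:
            try:
                out[key] = int(raw)            # coerce integers
            except ValueError:
                out[key] = raw                 # fall back to plain string

    return out

print(parse_kv_list("headless=true,viewport_width=1680"))
# {'headless': True, 'viewport_width': 1680}
```

Anything that is neither a boolean nor an integer stays a string, which is why `user_data_dir=$HOME/...` style values pass through unchanged.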

Quick mental model
- `profiles` = manage identities
- `browser ...` = control the long-running headless Chrome that all crawls can piggy-back on
- `crawl` = do the actual work
- `config` = tweak global defaults
- everything else is sugar

Quick-fire “profile” usage cheatsheet

| Scenario | Command (copy-paste ready) | Notes |
|---|---|---|
| Launch interactive Profile Manager UI | `crwl profiles` | Opens a TUI with options: 1 List, 2 Create, 3 Delete, 4 Use-to-crawl, 5 Exit. |
| Create a fresh profile | `crwl profiles` → choose 2 → name it → browser opens → log in → press `q` in terminal | Saves to `~/.crawl4ai/profiles/<name>`. |
| List saved profiles | `crwl profiles` → choose 1 | Shows name, browser type, size, last-modified. |
| Delete a profile | `crwl profiles` → choose 3 → pick the profile index → confirm | Removes the folder. |
| Crawl with a profile (default alias) | `crwl https://site.com/dashboard -p my-profile` | Keeps login cookies; sets `use_managed_browser=true` under the hood. |
| Crawl + verbose JSON output | `crwl https://site.com -p my-profile -o json -v` | Any other crawl flags work the same. |
| Crawl with extra browser tweaks | `crwl https://site.com -p my-profile -b "headless=true,viewport_width=1680"` | CLI overrides go on top of the profile. |
| Same but via explicit sub-command | `crwl crawl https://site.com -p my-profile` | Identical to the default alias. |
| Use a profile from inside the Profile Manager | `crwl profiles` → choose 4 → pick profile → enter URL → follow prompts | Handy when demoing to non-CLI folks. |
| One-off crawl with a profile folder path (no name lookup) | `crwl https://site.com -b "user_data_dir=$HOME/.crawl4ai/profiles/my-profile,use_managed_browser=true"` | Bypasses the registry; useful for CI scripts. |
| Launch a dev browser on a CDP port with the same identity | `crwl cdp -d $HOME/.crawl4ai/profiles/my-profile -P 9223` | Lets Puppeteer/Playwright attach for debugging. |
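The profile-plus-tweaks rows all rely on the same precedence rule: flags given on the command line win over whatever the saved profile supplies. A small sketch of that layering (the dict contents are hypothetical examples; the real `BrowserConfig` merge in crawl4ai may differ):

```python
# Hypothetical settings a saved profile might carry.
profile_defaults = {
    "user_data_dir": "/home/me/.crawl4ai/profiles/my-profile",
    "use_managed_browser": True,
    "headless": False,
}

# Hypothetical result of parsing -b "headless=true,viewport_width=1680".
cli_overrides = {
    "headless": True,
    "viewport_width": 1680,
}

# CLI overrides go on top of the profile: in a dict merge,
# the right-hand operand wins on key collisions.
effective = {**profile_defaults, **cli_overrides}

print(effective["headless"])       # True: the CLI flag wins
print(effective["user_data_dir"])  # profile value survives untouched
```

The same layering explains the CI-script row: passing `user_data_dir` and `use_managed_browser` directly through `-b` simply supplies the two keys the profile lookup would otherwise have filled in.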