Code Examples

docs/md_v2/core/examples.md

This page lists example scripts that demonstrate the features of Crawl4AI. Each example focuses on a specific piece of functionality, so you can see how to implement it in your own projects.

Getting Started Examples

| Example | Description | Link |
| --- | --- | --- |
| Hello World | A simple introductory example demonstrating basic usage of AsyncWebCrawler with JavaScript execution and content filtering. | View Code |
| Quickstart | A comprehensive collection of examples showcasing various features including basic crawling, content cleaning, link analysis, JavaScript execution, CSS selectors, media handling, custom hooks, proxy configuration, screenshots, and multiple extraction strategies. | View Code |
| Quickstart Set 1 | Basic examples for getting started with Crawl4AI. | View Code |
| Quickstart Set 2 | More advanced examples for working with Crawl4AI. | View Code |

Proxies

| Example | Description | Link |
| --- | --- | --- |
| NSTProxy | NSTProxy integrates seamlessly with Crawl4AI, with no setup required. Access high-performance residential, datacenter, ISP, and IPv6 proxies with smart rotation and anti-blocking technology. Starts from $0.1/GB. Use code crawl4ai for 10% off. | View Code |

Browser & Crawling Features

| Example | Description | Link |
| --- | --- | --- |
| Built-in Browser | Demonstrates how to use the built-in browser capabilities. | View Code |
| Browser Optimization | Focuses on browser performance optimization techniques. | View Code |
| arun vs arun_many | Compares the arun and arun_many methods for single vs. multiple URL crawling. | View Code |
| Multiple URLs | Shows how to crawl multiple URLs asynchronously. | View Code |
| Page Interaction | Guide on interacting with dynamic elements through clicks. | View Guide |
| Crawler Monitor | Shows how to monitor the crawler's activities and status. | View Code |
| Full Page Screenshot & PDF | Guide on capturing full-page screenshots and PDFs from massive webpages. | View Guide |
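
As a rough illustration of the arun vs. arun_many distinction listed above, the sketch below crawls one URL with arun and several with a single arun_many call. It is not the linked example script: the helper names crawl_one and crawl_many are made up for this sketch, the URLs in the usage note are placeholders, and the crawl4ai import is deferred into each function only so the file can be loaded without the package installed.

```python
import asyncio


async def crawl_one(url: str) -> bool:
    """Crawl a single URL with arun and report whether it succeeded."""
    # Deferred import: requires `pip install crawl4ai`.
    from crawl4ai import AsyncWebCrawler

    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url)
        return result.success


async def crawl_many(urls: list[str]) -> dict[str, bool]:
    """Crawl several URLs in one arun_many call; map each URL to its success flag."""
    from crawl4ai import AsyncWebCrawler

    async with AsyncWebCrawler() as crawler:
        # With default (non-streaming) settings this returns a list of results.
        results = await crawler.arun_many(urls=urls)
        return {r.url: r.success for r in results}
```

To try it, call for example `asyncio.run(crawl_many(["https://example.com", "https://example.org"]))` in an environment where Crawl4AI and its browser dependencies are installed.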

Advanced Crawling & Deep Crawling

| Example | Description | Link |
| --- | --- | --- |
| Deep Crawling | An extensive tutorial on deep crawling capabilities, demonstrating BFS and BestFirst strategies, stream vs. non-stream execution, filters, scorers, and advanced configurations. | View Code |
| Virtual Scroll | Comprehensive examples for handling virtualized scrolling on sites like Twitter and Instagram. Demonstrates different scrolling scenarios with a local test server. | View Code |
| Adaptive Crawling | Demonstrates intelligent crawling that automatically determines when sufficient information has been gathered. | View Code |
| Dispatcher | Shows how to use the crawl dispatcher for advanced workload management. | View Code |
| Storage State | Tutorial on managing browser storage state for persistence. | View Guide |
| Network Console Capture | Demonstrates how to capture and analyze network requests and console logs. | View Code |

Extraction Strategies

| Example | Description | Link |
| --- | --- | --- |
| Extraction Strategies | Demonstrates different extraction strategies with various input formats (markdown, HTML, fit_markdown) and JSON-based extractors (CSS and XPath). | View Code |
| Scraping Strategies | Compares the performance of different scraping strategies. | View Code |
| LLM Extraction | Demonstrates LLM-based extraction specifically for OpenAI pricing data. | View Code |
| LLM Markdown | Shows how to use LLMs to generate markdown from crawled content. | View Code |
| Summarize Page | Shows how to summarize web page content. | View Code |

E-commerce & Specialized Crawling

| Example | Description | Link |
| --- | --- | --- |
| Amazon Product Extraction | Demonstrates how to extract structured product data from Amazon search results using CSS selectors. | View Code |
| Amazon with Hooks | Shows how to use hooks with Amazon product extraction. | View Code |
| Amazon with JavaScript | Demonstrates using custom JavaScript for Amazon product extraction. | View Code |
| Crypto Analysis | Demonstrates how to crawl and analyze cryptocurrency data. | View Code |
| SERP API | Demonstrates using Crawl4AI with search engine result pages. | View Code |

Anti-Bot & Stealth Features

| Example | Description | Link |
| --- | --- | --- |
| Stealth Mode Quick Start | Five practical examples showing how to use stealth mode for bypassing basic bot detection. | View Code |
| Stealth Mode Comprehensive | Comprehensive demonstration of stealth mode features with bot detection testing and comparisons. | View Code |
| Undetected Browser | Simple example showing how to use the undetected browser adapter. | View Code |
| Undetected Browser Demo | Basic demo comparing regular and undetected browser modes. | View Code |
| Undetected Tests | Advanced tests comparing regular vs undetected browsers on various bot detection services. | View Folder |
| CapSolver Captcha Solver | Seamlessly integrate with CapSolver to automatically solve reCAPTCHA v2/v3, Cloudflare Turnstile / Challenges, AWS WAF and more for uninterrupted scraping and automation. | View Folder |

Customization & Security

| Example | Description | Link |
| --- | --- | --- |
| Hooks | Illustrates how to use hooks at different stages of the crawling process for advanced customization. | View Code |
| Identity-Based Browsing | Illustrates identity-based browsing configurations for authentic browsing experiences. | View Code |
| Proxy Rotation | Shows how to use proxy rotation for web scraping and avoiding IP blocks. | View Code |
| SSL Certificate | Illustrates SSL certificate handling and verification. | View Code |
| Language Support | Shows how to handle different languages during crawling. | View Code |
| Geolocation | Demonstrates how to use geolocation features. | View Code |

Docker & Deployment

| Example | Description | Link |
| --- | --- | --- |
| Docker Config | Demonstrates how to create and use Docker configuration objects. | View Code |
| Docker Basic | A test suite for Docker deployment, showcasing various functionalities through the Docker API. | View Code |
| Docker REST API | Shows how to interact with Crawl4AI Docker using REST API calls. | View Code |
| Docker SDK | Demonstrates using the Python SDK for Crawl4AI Docker. | View Code |

Application Examples

| Example | Description | Link |
| --- | --- | --- |
| Research Assistant | Demonstrates how to build a research assistant using Crawl4AI. | View Code |
| REST Call | Shows how to make REST API calls with Crawl4AI. | View Code |
| Chainlit Integration | Shows how to integrate Crawl4AI with Chainlit. | View Guide |
| Crawl4AI vs FireCrawl | Compares Crawl4AI with the FireCrawl library. | View Code |

Content Generation & Markdown

| Example | Description | Link |
| --- | --- | --- |
| Content Source | Demonstrates how to work with different content sources in markdown generation. | View Code |
| Content Source (Short) | A simplified version of content source usage. | View Code |
| Built-in Browser Guide | Guide for using the built-in browser capabilities. | View Guide |

Running the Examples

To run any of these examples, you'll need to have Crawl4AI installed:

```bash
pip install crawl4ai
```

Then run an example script as a module from the repository root, so that the docs.examples module path resolves:

```bash
python -m docs.examples.hello_world
```
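
The -m form takes a dotted module name rather than a file path. As a small sketch of that mapping, the hypothetical helper below (module_name is not part of Crawl4AI) converts a script path relative to the repository root into the name to pass to python -m. The path docs/examples/hello_world.py is inferred from the command above.

```python
from pathlib import PurePosixPath


def module_name(script_path: str) -> str:
    """Turn 'docs/examples/hello_world.py' into 'docs.examples.hello_world'."""
    path = PurePosixPath(script_path)
    if path.suffix != ".py":
        raise ValueError(f"not a Python script: {script_path}")
    # Drop the .py suffix, then join the path components with dots.
    return ".".join(path.with_suffix("").parts)


print("python -m " + module_name("docs/examples/hello_world.py"))
# prints: python -m docs.examples.hello_world
```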

For examples that require additional dependencies or environment variables, refer to the comments at the top of each file.

Some examples may require:

  • API keys (for LLM-based examples)
  • Docker setup (for Docker-related examples)
  • Additional dependencies (specified in the example files)
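
One way to check these requirements before running an example is a small preflight script. The sketch below uses only the standard library. The function name missing_requirements is hypothetical, and the specific names checked at the bottom (OPENAI_API_KEY, docker) are assumptions for illustration; the comments at the top of each example file state what that example actually needs.

```python
import importlib.util
import os
import shutil


def missing_requirements(env_vars=(), modules=(), commands=()):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    for name in env_vars:
        if not os.environ.get(name):
            problems.append(f"environment variable {name} is not set")
    for name in modules:
        # find_spec locates a top-level module without importing it.
        if importlib.util.find_spec(name) is None:
            problems.append(f"Python module {name} is not installed")
    for name in commands:
        if shutil.which(name) is None:
            problems.append(f"command {name} was not found on PATH")
    return problems


# Assumed requirements for a hypothetical LLM- and Docker-based example.
for problem in missing_requirements(
    env_vars=["OPENAI_API_KEY"], modules=["crawl4ai"], commands=["docker"]
):
    print("missing:", problem)
```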

Contributing New Examples

If you've created an interesting example that demonstrates a unique use case or feature of Crawl4AI, we encourage you to contribute it to our examples collection. Please see our contribution guidelines for more information.