# Crawl4AI v0.8.0 Release Notes
**Release Date:** January 2026
**Previous Version:** v0.7.6
**Status:** Release Candidate
## Breaking Changes

### Hooks disabled by default on the Docker API

**What changed:** Hooks are now disabled by default on the Docker API.

**Why:** Security fix for a Remote Code Execution (RCE) vulnerability.

**Who is affected:** Users of the Docker API who use the `hooks` parameter in `/crawl` requests.

**Migration:**
```bash
# To re-enable hooks (only if you trust all API users):
export CRAWL4AI_HOOKS_ENABLED=true
```
### file:// URLs rejected on API endpoints

**What changed:** The endpoints `/execute_js`, `/screenshot`, `/pdf`, and `/html` now reject `file://` URLs.

**Why:** Security fix for a Local File Inclusion (LFI) vulnerability.

**Who is affected:** Users who were reading local files via the Docker API.

**Migration:** Use the Python library directly for local file processing:
```python
# Instead of an API call with a file:// URL, use the library:
from crawl4ai import AsyncWebCrawler

async with AsyncWebCrawler() as crawler:
    result = await crawler.arun(url="file:///path/to/file.html")
```
## Security Advisories

### Remote Code Execution via hooks

**Severity:** CRITICAL (CVSS 10.0)

**Affected:** Docker API deployment (all versions before v0.8.0)

**Vector:** `POST /crawl` with a malicious `hooks` parameter
**Details:** The `__import__` builtin was available in hook code, allowing attackers to import `os`, `subprocess`, etc. and execute arbitrary commands.
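To illustrate the class of problem, here is a schematic sketch (not crawl4ai's actual hook runner): any evaluator that leaves `__import__` in scope for untrusted code can be escaped.

```python
# Schematic illustration only, not the project's code: with __import__
# reachable, untrusted code can load os/subprocess and run commands.
untrusted_hook = "__import__('os').getcwd()"  # could just as easily spawn a shell
print(eval(untrusted_hook, {"__builtins__": {"__import__": __import__}}))
```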
**Fix:**

- Removed `__import__` from the allowed builtins
- Hooks are now disabled by default (`CRAWL4AI_HOOKS_ENABLED=false`)

### Local File Inclusion via file:// URLs

**Severity:** HIGH (CVSS 8.6)
**Affected:** Docker API deployment (all versions before v0.8.0)

**Vector:** `POST /execute_js` (and other endpoints) with `file:///etc/passwd`

**Details:** API endpoints accepted `file://` URLs, allowing attackers to read arbitrary files from the server.

**Fix:** URL scheme validation now only allows `http://`, `https://`, and `raw:` URLs.
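Conceptually, the check is an allowlist of this shape (a minimal sketch, not the project's actual implementation):

```python
# Minimal sketch of an allowlist-style scheme check; not crawl4ai's code.
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}

def validate_url(url: str) -> None:
    if url.startswith("raw:"):
        return  # inline raw: HTML is permitted
    if urlparse(url).scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"URL scheme not allowed: {url!r}")
```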
**Discovered by:** Neo by ProjectDiscovery (projectdiscovery.io), December 2025
## New Features

### Init scripts

Pre-page-load JavaScript injection for stealth evasions:
```python
from crawl4ai import BrowserConfig

config = BrowserConfig(
    init_scripts=[
        # Runs before any page script; hides the webdriver flag
        "Object.defineProperty(navigator, 'webdriver', {get: () => false})"
    ]
)
```
### CDP improvements

- Support for WebSocket CDP endpoints (`ws://`, `wss://`)
- New `cdp_cleanup_on_close=True` option

### Deep crawl crash recovery

All deep crawl strategies (BFS, DFS, Best-First) now support crash recovery:
```python
from crawl4ai.deep_crawling import BFSDeepCrawlStrategy

strategy = BFSDeepCrawlStrategy(
    max_depth=3,
    resume_state=saved_state,       # Resume from a checkpoint
    on_state_change=save_callback,  # Persist state in real time
)
```
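`saved_state` and `save_callback` above are placeholders. One way to wire them up, assuming the strategy hands the callback a JSON-serializable state object (the state shape and callback signature here are assumptions, not documented API):

```python
# Hypothetical checkpoint plumbing for the example above.
import json
from pathlib import Path

STATE_FILE = Path("deep_crawl_state.json")

def save_callback(state: dict) -> None:
    STATE_FILE.write_text(json.dumps(state))  # persist after every change

saved_state = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else None
```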
### Cache-based exports

- Generate PDFs and MHTML from cached HTML content.
- Render cached HTML and capture screenshots (see the sketch below).
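A minimal sketch of how this could be used, assuming the existing `pdf`/`screenshot` flags on `CrawlerRunConfig` and a previously cached entry for the URL:

```python
from crawl4ai import AsyncWebCrawler, CacheMode, CrawlerRunConfig

config = CrawlerRunConfig(
    cache_mode=CacheMode.ENABLED,  # reuse cached HTML when available
    pdf=True,
    screenshot=True,
)

async with AsyncWebCrawler() as crawler:
    result = await crawler.arun(url="https://example.com", config=config)
    # result.pdf (bytes) and result.screenshot (base64 string) hold the artifacts
```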
### base_url for raw: HTML

Proper URL resolution for `raw:` HTML processing:
```python
config = CrawlerRunConfig(base_url='https://example.com')
result = await crawler.arun(url=f'raw:{html}', config=config)  # html holds the markup
```
### Link prefetch

Fast link extraction without full page processing:

```python
from crawl4ai import CrawlerRunConfig

config = CrawlerRunConfig(prefetch=True)
```
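A hedged usage sketch, assuming `prefetch` populates `result.links` the same way regular crawls do:

```python
from crawl4ai import AsyncWebCrawler

async with AsyncWebCrawler() as crawler:
    result = await crawler.arun(url="https://example.com", config=config)
    internal = result.links.get("internal", [])
    external = result.links.get("external", [])
    print(f"{len(internal)} internal, {len(external)} external links")
```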
### Proxy improvements

- Enhanced proxy rotation with sticky-session support (sketched below).
- The non-browser crawler now supports proxies.
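A rotation setup might look like the following, reusing the proxy helpers from earlier releases; the import paths and sticky-session specifics are assumptions and may differ by version:

```python
# Sketch based on the pre-existing proxy rotation API; treat names as assumptions.
from crawl4ai import CrawlerRunConfig
from crawl4ai.proxy_strategy import ProxyConfig, RoundRobinProxyStrategy

proxies = [
    ProxyConfig(server="http://proxy1:8080"),
    ProxyConfig(server="http://proxy2:8080"),
]
config = CrawlerRunConfig(proxy_rotation_strategy=RoundRobinProxyStrategy(proxies))
```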
### process_in_browser

New `process_in_browser` parameter for browser operations on local content:
```python
config = CrawlerRunConfig(
    process_in_browser=True,  # Force browser processing
    screenshot=True
)
result = await crawler.arun(url='raw:<html>...</html>', config=config)
```
### Sitemap cache invalidation

Intelligent cache invalidation for sitemaps:
```python
from crawl4ai import SeedingConfig

config = SeedingConfig(
    cache_ttl_hours=24,
    validate_sitemap_lastmod=True
)
```
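This config would typically be fed to the URL seeder; a hedged sketch using the existing seeding API (the new flags above are assumed to apply per discovery run):

```python
from crawl4ai import AsyncUrlSeeder, SeedingConfig

seeder = AsyncUrlSeeder()
config = SeedingConfig(source="sitemap", cache_ttl_hours=24, validate_sitemap_lastmod=True)
urls = await seeder.urls("example.com", config)  # re-fetches if the sitemap's lastmod changed
```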
## Bug Fixes

### raw: URLs containing #

**Problem:** CSS color codes like `#eee` were being truncated.
- Before: `raw:body{background:#eee}` → `body{background:`
- After: `raw:body{background:#eee}` → `body{background:#eee}`
### Cache fixes

Various fixes to cache validation and persistence.
## Upgrading

Update the package:
```bash
pip install --upgrade crawl4ai
```
Docker API users:

- Re-enable hooks only if you trust all API users: `export CRAWL4AI_HOOKS_ENABLED=true`
- `file://` URLs no longer work on the API (use the library directly)
- Review security settings:
```yaml
# config.yml - recommended for production
security:
  enabled: true
  jwt_enabled: true
```
Test your integration before deploying to production, especially if you use the `hooks` parameter in API calls or `file://` URLs via the API; a quick smoke test is sketched below.
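A minimal post-upgrade check, assuming a local Docker deployment on the default port; the exact payload shape is an assumption, so adjust it to your deployment:

```python
# Hypothetical smoke test: the API should now reject file:// URLs.
import requests

resp = requests.post(
    "http://localhost:11235/crawl",         # default Docker port (assumed)
    json={"urls": ["file:///etc/passwd"]},  # payload shape is an assumption
)
assert resp.status_code >= 400, "file:// URLs should be rejected"
```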
See CHANGELOG.md for complete version history.
## Acknowledgments

Thanks to all contributors who made this release possible.
Special thanks to Neo by ProjectDiscovery for responsible security disclosure.
For questions or issues, please open a GitHub Issue.