docs/vendor-code-intelligence.md
This document defines the crates-focused equivalent of opensrc for Flow vendoring.
Keep Cargo as the resolver/build authority while adding very fast local search across:
- first-party code (src, crates, scripts, docs, tests),
- vendored crates (lib/vendor/*),
- vendor metadata (lib/vendor-manifest/*.toml + vendor.lock.toml).

This gives AI and humans a fast map of "what code do we own right now?" without remote lookups.
Vendoring gives full control, but it increases local code volume. Without a fast index, trim/rewrite/sync loops become slower.
Typesense indexing solves that by keeping an always-queryable local code/search layer.
Start local Typesense (shared launcher in ~/code/infra/base):
f vendor-typesense-setup # one-time install in flox if needed
f vendor-typesense-up
f vendor-typesense-status
Build index:
f vendor-code-index
Search code chunks:
f vendor-code-search "Router"
f vendor-code-search "serde_json" --scope vendor --crate reqwest
f vendor-code-search "spawn" --scope firstparty --lang rs
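As a rough sketch of how flags like --scope, --crate, and --lang could translate into a Typesense query, the helper below builds a search-parameter dictionary. The parameter keys (q, query_by, filter_by, per_page) are standard Typesense search parameters; the field names scope, crate, lang, content, and path are assumptions about the chunk schema, not taken from the actual index script.

```python
def build_search_params(query, scope=None, crate=None, lang=None, limit=10):
    """Translate CLI-style filters into Typesense search parameters.

    Field names (scope, crate, lang, content, path) are hypothetical.
    """
    filters = []
    if scope:
        filters.append(f"scope:={scope}")
    if crate:
        filters.append(f"crate:={crate}")
    if lang:
        filters.append(f"lang:={lang}")

    params = {
        "q": query,
        "query_by": "content,path",  # assumed searchable fields
        "per_page": limit,
    }
    if filters:
        # Typesense combines exact-match clauses with &&
        params["filter_by"] = " && ".join(filters)
    return params


params = build_search_params("serde_json", scope="vendor", crate="reqwest")
print(params["filter_by"])  # scope:=vendor && crate:=reqwest
```

Keeping filter construction separate from the HTTP call makes it easy to check that each flag narrows the query as intended before touching a live collection.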
Search source inventory:
f vendor-code-search-sources "ratatui"
f vendor-code-search-sources "github.com" --limit 50
Inspect raw inventory:
f vendor-code-sources
The index script (scripts/vendor/typesense_code_index.py) writes:
.vendor/typesense/sources.json:
<prefix>_sources collection:
<prefix>_chunks collection:
Default prefix is flow_code, so the default collections are:
- flow_code_sources
- flow_code_chunks

vendor.lock.toml is the canonical vendored crate set.
lib/vendor-manifest/*.toml is the per-crate provenance and sync state.
The index script reads both, so every inhouse/sync/hydrate cycle can be followed by:
f vendor-code-index
This keeps search aligned with the exact pinned state currently compiled by Cargo.
1. Vendor or sync (vendor-control.sh, scripts/vendor/sync-*).
2. Reindex (f vendor-code-index).
3. Search (f vendor-code-search ...).
4. Verify (cargo check, vendoring verify gates).
5. flow-vendor, pin updates in flow.

- Use --dry-run for large experiments before writing collections.
- Use --max-files for quick smoke indexing in CI/debug runs.
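To make the --dry-run and --max-files behavior concrete, here is a minimal sketch of an indexing loop that honors both. It is not the actual typesense_code_index.py logic: the function names, the 40-line chunk size, and the document fields are all assumptions for illustration.

```python
from pathlib import Path


def chunk_lines(text, size=40):
    """Split text into consecutive chunks of at most `size` lines."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + size]) for i in range(0, len(lines), size)]


def plan_index(paths, max_files=None, dry_run=True, write=None):
    """Chunk up to max_files files; call write(doc) only when not a dry run.

    `write` stands in for whatever uploads a document to the chunks
    collection (hypothetical callback).
    """
    selected = list(paths)
    if max_files is not None:
        selected = selected[:max_files]

    chunks = 0
    written = 0
    for path in selected:
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        for n, chunk in enumerate(chunk_lines(text)):
            chunks += 1
            if not dry_run and write is not None:
                write({"path": str(path), "chunk": n, "content": chunk})
                written += 1
    return {"files": len(selected), "chunks": chunks, "written": written}
```

In a dry run the function still reads and chunks files, so the returned counts show how large the index would be without writing anything; capping max_files keeps a CI smoke run short.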