eden/.llms/rules/ACR_repeated_large_traversal.md
Severity: CRITICAL
let changes: Vec<_> = cs.file_changes().collect() followed by multiple for change in &changes or .iter().filter(...) passes that could be mergedfor cs_id in commit_range { derive::<RootManifestId>(ctx, repo, cs_id).await? } -- sequential manifest derivation instead of batch.file_changes() or .changed_files() on the same bonsai changeset multiple timesBAD (multiple passes over file changes):
let changes: Vec<_> = bonsai.file_changes().collect();
// Pass 1: check for large files
for (path, change) in &changes {
if let Some(fc) = change.simplify() {
check_size(fc.size())?;
}
}
// Pass 2: check for restricted paths
for (path, change) in &changes {
check_restricted_path(path)?;
}
GOOD (single pass):
for (path, change) in bonsai.file_changes() {
check_restricted_path(path)?;
if let Some(fc) = change.simplify() {
check_size(fc.size())?;
}
}
BAD (sequential manifest derivation -- matches real performance issues):
for cs_id in &commit_ids {
let mf_id = repo
.derive::<RootManifestId>(ctx, *cs_id)
.await?;
process_manifest(mf_id).await?;
}
GOOD (batch derivation):
let mf_ids = repo
.derive_batch::<RootManifestId>(ctx, commit_ids.clone())
.await?;
for mf_id in mf_ids {
process_manifest(mf_id).await?;
}
Or for linear stacks:
derive_manifests_for_simple_stack_of_commits(ctx, repo, commit_ids).await?;
BAD (re-fetching in inner loop):
for cs_id in &ancestors {
let bonsai = cs_id.load(ctx, repo.repo_blobstore()).await?;
for (path, _change) in bonsai.file_changes() {
// Fetches the same manifest for cs_id again inside the loop
let mf = repo.derive::<RootManifestId>(ctx, *cs_id).await?;
let entry = mf.find_entry(ctx, repo.repo_blobstore(), path.clone()).await?;
process(entry)?;
}
}
GOOD (fetch once, reuse):
for cs_id in &ancestors {
let bonsai = cs_id.load(ctx, repo.repo_blobstore()).await?;
let mf = repo.derive::<RootManifestId>(ctx, *cs_id).await?;
let paths: Vec<_> = bonsai.file_changes().map(|(p, _)| p.clone()).collect();
let entries = mf.find_entries(ctx, repo.repo_blobstore(), paths).await?;
for entry in entries {
process(entry)?;
}
}
Merge multiple passes over the same collection into a single loop. Use derive_batch or derive_manifests_for_simple_stack_of_commits instead of deriving manifests one at a time in a loop. Hoist data fetches out of inner loops when they depend only on outer-loop variables. Use find_entries for batch path lookup instead of calling find_entry in a loop.