.agents/skills/ci-analysis/references/build-progression-analysis.md
When the current build is failing, the PR's build history can reveal whether the failure existed from the start or appeared after specific changes. This is a fact-gathering technique — like target-branch comparison — that provides context for understanding the current failure.
Don't try to analyze the full build history upfront — especially on large PRs with many pushes. Start with the most recent N builds (5-8), present the progression table, and let the user decide whether to dig deeper into earlier builds.
On large PRs, the user is usually iterating toward a solution. The recent builds are the most relevant. Offer: "Here are the last N builds — the pass→fail transition was between X and Y. Want me to look at earlier builds?"
gh pr checks only shows checks for the current HEAD SHA. To see the full build history, use AzDO or CLI:
With AzDO (preferred):
Query AzDO for builds on refs/pull/{PR}/merge branch, sorted by queue time descending, top 20, in the public project. The response includes triggerInfo with pr.sourceSha — the PR's HEAD commit for each build.
💡 Key parameters:
branchName: "refs/pull/{PR}/merge",queryOrder: "QueueTimeDescending",top: 20, projectpublic(for dnceng-public org).
Without MCP (fallback):
$org = "https://dev.azure.com/dnceng-public"
$project = "public"
az pipelines runs list --branch "refs/pull/{PR}/merge" --top 20 --org $org -p $project -o json
Each build's triggerInfo contains pr.sourceSha — the PR's HEAD commit when the build was triggered. Extract it from the build response or CLI output.
⚠️
sourceVersionis the merge commit, not the PR's head commit. UsetriggerInfo.'pr.sourceSha'instead.
⚠️ Target branch moves between builds. Each build merges
pr.sourceShainto the target branch HEAD at the time the build starts. Ifmainreceived new commits between build N and N+1, the two builds merged against different baselines — even ifpr.sourceShais the same. Always extract the target branch HEAD to detect baseline shifts.
Shortcut for the latest build — use the GitHub merge commit:
For the current/latest build, the merge ref (refs/pull/{PR}/merge) is available via the GitHub API. The merge commit's first parent is the target branch HEAD at the time GitHub computed the merge:
Look up the merge commit's parents — the first parent is the target branch HEAD. Use the GitHub API or MCP (get_commit with the sourceVersion SHA) to get the commit details. The sourceVersion from the AzDO build is the merge commit SHA (not pr.sourceSha). Example:
gh api repos/{owner}/{repo}/git/commits/{sourceVersion} --jq '.parents[0].sha'
This is simpler than parsing checkout logs.
⚠️ This only works for the latest build. GitHub recomputes
refs/pull/{PR}/mergeon each push, so the merge commit changes. For historical builds in a progression analysis, the merge ref no longer reflects what was built — use the checkout log method below.
For historical builds — extract from checkout logs:
The AzDO build API doesn't expose the target branch SHA. Extract it from the checkout task log.
With AzDO (preferred):
Fetch the checkout task log for the build — typically log ID 5, starting around line 500+ (skip the early git-fetch output). Search the output for the merge line:
HEAD is now at {mergeCommit} Merge {prSourceSha} into {targetBranchHead}
💡
logId: 5is the first checkout task in most dotnet pipelines. If it doesn't contain the merge line, check the build timeline for "Checkout" tasks to find the correct log ID.
Without MCP (fallback):
$token = az account get-access-token --resource "499b84ac-1321-427f-aa17-267ca6975798" --query accessToken -o tsv
$headers = @{ Authorization = "Bearer $token" }
$logUrl = "https://dev.azure.com/{org}/{project}/_apis/build/builds/{BUILD_ID}/logs/5"
$log = Invoke-RestMethod -Uri $logUrl -Headers $headers
Note: log ID 5 is the first checkout task in most pipelines. The merge line is typically around line 500-650. If log 5 doesn't contain it, check the build timeline for "Checkout" tasks.
Note: a PR may have more unique pr.sourceSha values than commits visible on GitHub, because force-pushes replace the commit history. Each force-push triggers a new build with a new merge commit and a new pr.sourceSha.
Use the SQL tool to track builds as you discover them. This avoids losing context and enables queries across the full history:
CREATE TABLE IF NOT EXISTS build_progression (
build_id INT PRIMARY KEY,
pr_sha TEXT,
target_sha TEXT,
result TEXT, -- passed, failed, canceled
queued_at TEXT,
failed_jobs TEXT, -- comma-separated job names
notes TEXT
);
Insert rows as you extract data from each build:
INSERT INTO build_progression VALUES
(1283986, '7af79ad', '2d638dc', 'failed', '2026-02-08T10:00:00Z', 'WasmBuildTests', 'Initial commits'),
(1284169, '28ec8a0', '0b691ba', 'failed', '2026-02-08T14:00:00Z', 'WasmBuildTests', 'Iteration 2'),
(1284433, '39dc0a6', '18a3069', 'passed', '2026-02-09T09:00:00Z', NULL, 'Iteration 3');
Then query to find the pass→fail transition:
-- Find where it went from passing to failing
SELECT * FROM build_progression ORDER BY queued_at;
-- Did the target branch move between pass and fail?
SELECT pr_sha, target_sha, result FROM build_progression
WHERE result IN ('passed', 'failed') ORDER BY queued_at;
-- Which builds share the same PR SHA? (force-push detection)
SELECT pr_sha, COUNT(*) as builds, GROUP_CONCAT(result) as results
FROM build_progression GROUP BY pr_sha HAVING builds > 1;
Present the table to the user:
| PR HEAD | Target HEAD | Builds | Result | Notes |
|---|---|---|---|---|
| 7af79ad | 2d638dc | 1283986 | ❌ | Initial commits |
| 28ec8a0 | 0b691ba | 1284169 | ❌ | Iteration 2 |
| 39dc0a6 | 18a3069 | 1284433 | ✅ | Iteration 3 |
| f186b93 | 5709f35 | 1286087 | ❌ | Added commit C; target moved ~35 commits |
| 2e74845 | 482d8f9 | 1286967 | ❌ | Modified commit C |
When both pr.sourceSha AND Target HEAD change between a pass→fail transition, either could be the cause. Analyze the failure content to determine which. If only the target moved (same pr.sourceSha), the failure came from the new baseline.
For deeper analysis, track which tests failed in each build:
CREATE TABLE IF NOT EXISTS build_failures (
build_id INT,
job_name TEXT,
test_name TEXT,
error_snippet TEXT,
helix_job TEXT,
work_item TEXT,
PRIMARY KEY (build_id, job_name, test_name)
);
Insert failures as you investigate each build, then query for patterns:
-- Tests that fail in every build (persistent, not flaky)
SELECT test_name, COUNT(DISTINCT build_id) as fail_count, GROUP_CONCAT(build_id) as builds
FROM build_failures GROUP BY test_name HAVING fail_count > 1;
-- New failures in the latest build (what changed?)
SELECT f.* FROM build_failures f
LEFT JOIN build_failures prev ON f.test_name = prev.test_name AND prev.build_id = {PREV_BUILD_ID}
WHERE f.build_id = {LATEST_BUILD_ID} AND prev.test_name IS NULL;
-- Flaky tests: fail in some builds, pass in others
SELECT test_name FROM build_failures GROUP BY test_name
HAVING COUNT(DISTINCT build_id) < (SELECT COUNT(*) FROM build_progression WHERE result = 'failed');
Report what the progression shows:
💡 Stop when you have the progression table and the pass→fail transition identified. The table + transition commits + error category is enough for the user to act. Don't investigate further (e.g., comparing individual commits, checking passing builds, exploring main branch history) unless the user asks.
Do not make fix recommendations based solely on build progression. The progression narrows the investigation — it doesn't determine the right fix. The human may have context about why changes were made, what constraints exist, or what the reviewer intended.
When the progression shows that a failure appeared after new commits, check whether those commits were review-requested:
# Get review comments with timestamps
gh api "repos/{OWNER}/{REPO}/pulls/{PR}/comments" `
--jq '.[] | {author: .user.login, body: .body, created: .created_at}'
Present this as additional context: "Commit C was pushed after reviewer X commented requesting Y." Let the author decide how to proceed.
Build progression identifies which change correlates with the current failure. Binlog comparison (see binlog-comparison.md) shows what's different in the build between a passing and failing state. Together they provide a complete picture:
Both techniques compare a failing build against a passing one:
| Technique | Passing build from | Answers |
|---|---|---|
| Target-branch comparison | Recent build on the base branch (e.g., main) | "Does this test pass without the PR's changes at all?" |
| Build progression | Earlier build on the same PR | "Did this test pass with the PR's earlier changes?" |
Use target-branch comparison first to confirm the failure is PR-related. Use build progression to narrow down which part of the PR introduced it. If build progression shows a pass→fail transition with the same pr.sourceSha, the target branch is the more likely culprit — use target-branch comparison to confirm.
❌ Don't treat build history as a substitute for analyzing the current build. The current build determines CI status. Build history is context for understanding and investigating the current failure.
❌ Don't make fix recommendations from progression alone. "Build N passed and build N+1 failed after adding commit C" is a fact worth reporting. "Therefore revert commit C" is a judgment that requires more context than the agent has — the commit may be addressing a critical review concern, fixing a different bug, or partially correct.
❌ Don't assume earlier passing builds prove the original approach was complete. A build may pass because it didn't change enough to trigger the failing test scenario. The reviewer who requested additional changes may have identified a real gap.
❌ Don't assume MSBuild changes only affect the platform you're looking at. MSBuild properties, conditions, and targets are shared infrastructure. A commit that changes a condition, moves a property, or modifies a restore flag can impact any platform that evaluates the same code path. When a commit touches MSBuild files, verify its impact across all platforms — don't assume it's scoped to the one you're investigating.