.agents/skills/ci-status/SKILL.md
Provide a dashboard view of SkiaSharp CI health across main and recent release branches, with AI-powered analysis to identify patterns, regressions, and actionable fixes.
Unlike the release-status skill (which tracks a single release through the pipeline chain),
this skill gives a broad overview of CI health across multiple branches simultaneously.
The collector script requires:
- az CLI: authenticated with access to xamarin/public and devdiv/DevDiv orgs
- gh CLI: authenticated with read access to mono/SkiaSharp and mono/SkiaSharp-API-docs
- git branch -r returns up-to-date release branches

| Pipeline Name | Org/Project | Definition ID | URL |
|---|---|---|---|
| SkiaSharp (Public) | xamarin/public | 4 | link |
| Order | Pipeline Name | Definition ID | URL |
|---|---|---|---|
| 1 | SkiaSharp-Native | 26493 | link |
| 2 | SkiaSharp | 10789 | link |
| 3 | SkiaSharp-Tests | 15756 | link |
| Workflow | Repository | Trigger | Why Track |
|---|---|---|---|
| Docs - Deploy | mono/SkiaSharp | Push/PR to main | Docs site broken if failing |
| Docs - Go Live! | mono/SkiaSharp | Workflow dispatch | Docs don't publish if failing |
| Docs - PR Staging - Cleanup | mono/SkiaSharp | PR close events | Stale staging deploys accumulate |
| Docs - PR Staging - Sweep Stale | mono/SkiaSharp | Daily (06:00 UTC) | Stale staging deploys accumulate |
| Publish Samples | mono/SkiaSharp | Push/PR to samples/ | Sample projects broken if failing |
| API Diff | mono/SkiaSharp | Weekly (Sun 00:00 UTC) | API regression detection |
| Auto Docs Submodule Sync | mono/SkiaSharp | Daily (10:00 UTC) | API docs get out of sync |
| Update Release Notes | mono/SkiaSharp | Push to main/release/tags | Release notes stop auto-updating |
| Skia Upstream Sync | mono/SkiaSharp | Daily (07:00 UTC) | Upstream tracking breaks |
| Nightly Fix Finder | mono/SkiaSharp | Nightly | Nightly automation health |
| Auto-Triage SkiaSharp Issue | mono/SkiaSharp | Daily (04:05 UTC) + issue events | Triage automation stops |
| Persist Agentic Workflow Data | mono/SkiaSharp | Push to main | AI workflow data lost |
| Backport | mono/SkiaSharp | PR label/comment | Cherry-picks to release branches fail |
| Automatic Rebase | mono/SkiaSharp | PR comment | PR rebase automation broken |
| Add PR Artifacts Comment | mono/SkiaSharp | Workflow run events | Build links not posted to PRs |
| Auto API Docs Writer | mono/SkiaSharp-API-docs | Scheduled/dispatch | XML docs stop being written |
| Automerge Docs | mono/SkiaSharp-API-docs | PR events | Doc PRs won't auto-merge |
| Go Live | mono/SkiaSharp-API-docs | Workflow dispatch | Docs don't publish to live |
Run the collector script to gather build status, issues, and commit info:
python3 .agents/skills/ci-status/scripts/ci-status.py \
--json output/ai/ci-status-data.json
This produces ci-status-data.json containing raw pipeline runs, GitHub Actions statuses,
regression markers, issues/errors, and associated commits. The script also prints a console
summary for quick visual inspection.
| Flag | Default | Description |
|---|---|---|
| --branches N | 3 | Number of most recent release/* branches to include |
| --builds N | 5 | Number of recent builds to show per pipeline per branch |
| --no-issues | off | Skip fetching errors/warnings (faster, less detail) |
| --json PATH | none | Write raw structured JSON (for AI analysis) |
# Standard collection
python3 .agents/skills/ci-status/scripts/ci-status.py --json output/ai/ci-status-data.json
# Quick check (no timeline fetch)
python3 .agents/skills/ci-status/scripts/ci-status.py --no-issues
# Deep analysis window (10 builds, 5 branches)
python3 .agents/skills/ci-status/scripts/ci-status.py --branches 5 --builds 10 --json output/ai/ci-status-data.json
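The flags and defaults in the table can be pictured with a small `argparse` sketch. This is only an illustration of the documented interface, assuming the collector parses its flags in the usual way; `build_parser` is a hypothetical name and the real script may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="ci-status.py")
    parser.add_argument("--branches", type=int, default=3, metavar="N",
                        help="number of most recent release branches to include")
    parser.add_argument("--builds", type=int, default=5, metavar="N",
                        help="number of recent builds per pipeline per branch")
    parser.add_argument("--no-issues", action="store_true",
                        help="skip fetching errors/warnings")
    parser.add_argument("--json", default=None, metavar="PATH",
                        help="write raw structured JSON to PATH")
    return parser


# Equivalent of the "deep analysis window" invocation, minus the JSON path.
args = build_parser().parse_args(["--branches", "5", "--builds", "10"])
```

With no arguments, the sketch falls back to 3 branches, 5 builds, issue fetching enabled, and no JSON output, matching the Default column.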
| File | Produced By | Consumed By |
|---|---|---|
| output/ai/ci-status-data.json | Collector script (Step 1) | AI analysis only (Step 2) |
| output/ai/ci-status-YYYY-MM-DD.json | AI (Step 3) | Validator + Renderers (Steps 4–5) |
⚠️ Never pass ci-status-data.json to the validate or render scripts. Those scripts expect the augmented report JSON that the AI assembles in Step 3.
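One way to avoid mixing up the two files is a small guard around the path handling. The sketch below is not part of the skill's scripts; `report_path` and `ensure_report_json` are hypothetical helpers that only encode the naming convention from the table above.

```python
from datetime import date
from pathlib import Path

RAW_DATA_NAME = "ci-status-data.json"  # raw collector output from Step 1


def report_path(day: date, out_dir: str = "output/ai") -> Path:
    """Build the dated report path, e.g. output/ai/ci-status-2026-05-29.json."""
    return Path(out_dir) / f"ci-status-{day.isoformat()}.json"


def ensure_report_json(path: str) -> Path:
    """Raise if the path is the raw collector output rather than the report."""
    candidate = Path(path)
    if candidate.name == RAW_DATA_NAME:
        raise ValueError(
            f"{candidate} is raw collector data; pass the dated report JSON instead"
        )
    return candidate
```

Calling `ensure_report_json` before invoking the validator or a renderer turns the mistake into an immediate, readable error.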
After the script runs, read the JSON data (output/ai/ci-status-data.json) and perform the following analysis. All claims must reference actual build IDs and URLs from the data.
Classify overall health as one of:
Write 1-2 sentences: what's broken, since when, and the top action.
Group all errors/warnings across all branches and pipelines by normalized signature:
- Signature: (task_name, normalized_first_error_line)
- Category, one of: code_regression, flake, infra_network, quota_resource, chain_blockage, unknown

⚠️ Important: Finalize root-cause clusters only AFTER performing the Pipeline Chain Analysis (§2.3). Downstream cascade failures must be collapsed into the upstream root cause, not counted as independent clusters.
| Signal | Category |
|---|---|
| Message contains network, timeout, EOF, connection, nuget.org, 429, 503 | infra_network |
| Message contains No space left, OOM, killed, agent lost, pool | quota_resource |
| Same build passes/fails with no code change (pass/fail/pass pattern) | flake |
| Failure appears on multiple unrelated branches simultaneously | infra_network |
| Failure appears at a green→red transition on one branch only | code_regression |
| Downstream pipeline failed but upstream in same chain also failed | chain_blockage |
| None of the above | unknown |
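The signature grouping and the two keyword rows of the table can be sketched in a few lines of Python. Several details here are assumptions rather than documented behavior: the normalization rule (lowercase, mask digit runs, collapse whitespace), the case-insensitive substring matching, and the precedence of infra_network over quota_resource. The remaining rows need build history and cannot be decided from the message alone.

```python
import re
from collections import defaultdict

# Keyword signals copied from the table; matched case-insensitively.
INFRA_NETWORK = ("network", "timeout", "eof", "connection", "nuget.org", "429", "503")
QUOTA_RESOURCE = ("no space left", "oom", "killed", "agent lost", "pool")


def normalize_line(line: str) -> str:
    """Assumed normalization: lowercase, mask digit runs, collapse whitespace."""
    masked = re.sub(r"\d+", "<n>", line.lower())
    return re.sub(r"\s+", " ", masked).strip()


def signature(task_name: str, message: str) -> tuple[str, str]:
    """Return (task_name, normalized_first_error_line) for one error record."""
    lines = message.strip().splitlines()
    first_line = lines[0] if lines else ""
    return (task_name, normalize_line(first_line))


def classify_message(message: str) -> str:
    """Apply only the message-based rows; everything else stays 'unknown'."""
    text = message.lower()
    if any(keyword in text for keyword in INFRA_NETWORK):
        return "infra_network"
    if any(keyword in text for keyword in QUOTA_RESOURCE):
        return "quota_resource"
    return "unknown"


def cluster(errors):
    """Group (task_name, message) pairs by their normalized signature."""
    groups = defaultdict(list)
    for task_name, message in errors:
        groups[signature(task_name, message)].append(message)
    return dict(groups)
```

Plain substring matching is deliberately crude: a token such as 429 can also appear inside a build ID, so keyword hits are best treated as a first guess that the chain and regression analysis may override.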
The Internal chain is sequential: Native → Managed → Tests. A red pipeline is NOT automatically an independent failure — it is often a downstream casualty of an upstream break.
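The cascade idea can be made concrete with a small sketch. It assumes each pipeline result has been reduced to a red/green flag plus an optional error signature, and it treats a downstream red pipeline as blocked when it has no error of its own or repeats the upstream error. That rule, the `chain_verdict` name, the verdict labels, and the `independentPipelines` key are all illustrative assumptions; the other key names follow the chainAnalysis entry of the report schema.

```python
CHAIN = ("Native", "Managed", "Tests")  # internal chain order, upstream first


def chain_verdict(branch, results):
    """Return a chain verdict for one branch.

    results maps pipeline name -> (is_red, error_signature or None).
    """
    red = [name for name in CHAIN if results.get(name, (False, None))[0]]
    if not red:
        return {"branch": branch, "verdict": "none", "rootPipeline": None,
                "cascadedPipelines": [], "independentPipelines": []}

    root = red[0]  # the most upstream red pipeline is the root candidate
    root_error = results[root][1]
    cascaded, independent = [], []
    for name in red[1:]:
        error = results[name][1]
        # Assumption: no own error, or the same error as the root, means the
        # pipeline was blocked downstream rather than independently broken.
        if error is None or error == root_error:
            cascaded.append(name)
        else:
            independent.append(name)

    if independent:
        verdict = "independent"
    elif cascaded:
        verdict = "cascade"
    else:
        verdict = "single"
    return {"branch": branch, "verdict": verdict, "rootPipeline": root,
            "cascadedPipelines": cascaded, "independentPipelines": independent}
```

A real analysis would still read the actual error text before accepting the verdict, since two different messages can share one underlying cause.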
For every branch that has ≥1 red internal pipeline, you MUST:
Emit one explicit sentence per affected branch, even if the answer is "no cascade":
- "release/X: 3 red internal pipelines — root-caused to {Native}; Managed+Tests were blocked downstream, not independently broken."
- "release/X: Native and Tests both red but with unrelated errors ({errA} vs {errB}) — two independent failures, not a cascade."

⚠️ Do not skip this step or list internal pipelines as separate equal-weight failures without first stating the chain verdict. This is the most common analysis miss.
For each branch × pipeline, find green→red transitions (the script pre-computes these in regression fields):
- changes: these are the regression suspects

Look for alternating pass/fail patterns within a single branch × pipeline:
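As an illustration of both ideas, the sketch below finds green→red transitions and counts pass/fail flips in one branch × pipeline history. The input shape (a list of `(commit, passed)` tuples ordered oldest to newest) and the two-flip threshold are assumptions made for this example; the collector's own regression fields remain the source of truth.

```python
def green_to_red_transitions(runs):
    """Return commits of builds that went red right after a green build.

    runs is a list of (commit, passed) tuples ordered oldest to newest.
    """
    return [
        runs[i][0]
        for i in range(1, len(runs))
        if runs[i - 1][1] and not runs[i][1]
    ]


def outcome_flips(runs):
    """Count how often the outcome changes between consecutive builds."""
    outcomes = [passed for _, passed in runs]
    return sum(1 for a, b in zip(outcomes, outcomes[1:]) if a != b)


def looks_flaky(runs, min_flips=2):
    """Assumed heuristic: at least two flips (pass/fail/pass) suggests a flake."""
    return outcome_flips(runs) >= min_flips
```

A single green→red transition followed by sustained red has only one flip, so it reads as a regression rather than a flake under this heuristic.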
| Pattern | Interpretation |
|---|---|
| Same error on ≥2 unrelated branches | Infrastructure issue (toolchain, agent pool, NuGet) |
| Error on exactly one branch | Code regression specific to that branch |
| Same error on release/* and main | Shared code issue (or infra) |
| Error only on release/X.Y.x but not release/X.Y.Z | Servicing-specific backport issue |
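The first three rows of this table can be approximated by looking at the set of branches on which one error signature appears. The sketch is a rough reading of the table, not a documented rule: it assumes that main plus at least one release branch means "shared code issue (or infra)", and that two or more release branches without main count as "unrelated". The servicing-specific row is left out because it needs version parsing. `interpret_footprint` and its return labels are hypothetical.

```python
def interpret_footprint(branches):
    """Interpret where one error signature appears.

    branches is an iterable of branch names such as "main" or "release/3.119.x".
    """
    seen = set(branches)
    if not seen:
        return "not_observed"
    if len(seen) == 1:
        return "code_regression"  # error on exactly one branch
    has_main = "main" in seen
    has_release = any(name.startswith("release/") for name in seen)
    if has_main and has_release:
        return "shared_code_or_infra"  # same error on release/* and main
    return "infrastructure"  # same error on two or more other branches
```

The output is a hint for the analysis, to be confirmed against the actual error text and timing before it goes into a root-cause cluster.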
For each release/* branch:
- Recommendation: one of ship, wait, cherry-pick, investigate

For each tracked GitHub Actions workflow:
GitHub Actions failures don't block releases directly (AzDO owns that), but they indicate broken automation that accumulates tech debt if ignored. Flag all failures as actionable — the goal is "what automation is failing" not just "can we release".
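A minimal sketch of the workflow roll-up follows. It assumes each workflow record carries a `name` and the `conclusion` of its latest run, using the `success` and `failure` values that GitHub reports for completed runs; the field names in the collector's JSON may differ, and `summarize_workflows` is a hypothetical helper. Only the total, healthy, and failing counters named in the report schema are produced.

```python
def summarize_workflows(workflows):
    """Return (summary, failing_names) for a list of workflow records.

    Each record is a dict with assumed keys 'name' and 'conclusion'.
    """
    failing_names = [
        wf["name"] for wf in workflows if wf.get("conclusion") == "failure"
    ]
    healthy = sum(1 for wf in workflows if wf.get("conclusion") == "success")
    summary = {
        "total": len(workflows),
        "healthy": healthy,
        "failing": len(failing_names),
    }
    return summary, failing_names
```

Every name in `failing_names` would then be flagged as actionable, in line with the guidance above.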
Provide at most 5 prioritized actions, ordered by:
Each recommendation should include:
After completing the analysis, assemble a single JSON file that combines the raw data with your analysis. This JSON is the source of truth for rendering.
Write the JSON to: output/ai/ci-status-YYYY-MM-DD.json
The schema is documented in references/report-schema.md. Top-level keys:
{
"meta": { "date", "timestamp", "schemaVersion": "1.0", "window", "branches" },
"verdict": { "status", "emoji", "summary" },
"azdoHealth": { "branches": [...], "regressions": [...] },
"chainAnalysis": [ { "branch", "verdict", "summary", "rootPipeline", "cascadedPipelines" } ],
"rootCauses": [ { "id", "title", "category", "severity", "footprint", "sampleError", ... } ],
"githubActions": { "workflows": [...], "summary": { "total", "healthy", "failing", ... } },
"flakes": [ { "branch", "pipeline", "pattern", "confidence", "description" } ],
"releaseRisk": [ { "branch", "shippable", "daysSinceGreen", "blockers", "recommendation" } ],
"recommendations": [ { "priority", "severity", "action", "reason", "target", "buildUrl" } ]
}
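Before running the validator, a quick structural check can catch a missing section early. The sketch below only tests for the top-level keys listed above; it is a convenience under that assumption, not a substitute for validate-ci-status.py, and `missing_top_level_keys` is a hypothetical helper.

```python
import json

REQUIRED_KEYS = (
    "meta", "verdict", "azdoHealth", "chainAnalysis", "rootCauses",
    "githubActions", "flakes", "releaseRisk", "recommendations",
)


def missing_top_level_keys(report: dict) -> list[str]:
    """Return the required top-level keys absent from the report, in order."""
    return [key for key in REQUIRED_KEYS if key not in report]


def check_report_file(path: str) -> list[str]:
    """Load a report JSON file and list its missing top-level keys."""
    with open(path, encoding="utf-8") as handle:
        return missing_top_level_keys(json.load(handle))
```

An empty list means every documented top-level section is present; the per-field rules are still left to the validator.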
Rules for assembling the JSON:
- The azdoHealth section contains the raw pipeline data from ci-status-data.json, restructured to match the schema
- The githubActions section combines raw workflow data with your severity/status classifications
- The remaining sections (verdict, chainAnalysis, rootCauses, flakes, releaseRisk, recommendations) are your AI analysis
- buildUrl and buildEvidence must reference real builds from the collected data

Run the validator to check the assembled JSON:
python3 .agents/skills/ci-status/scripts/validate-ci-status.py output/ai/ci-status-YYYY-MM-DD.json
If validation fails, fix the JSON and re-validate. Common issues:
Generate HTML and Markdown reports from the validated JSON:
# HTML dashboard (self-contained, opens in browser)
python3 .agents/skills/ci-status/scripts/render-ci-status.py output/ai/ci-status-YYYY-MM-DD.json
# Markdown report (for AI consumption and downstream agents)
python3 .agents/skills/ci-status/scripts/render-ci-status-md.py output/ai/ci-status-YYYY-MM-DD.json
This produces:
- output/ai/ci-status-YYYY-MM-DD.html: Bootstrap 5 dashboard viewable in any browser
- output/ai/ci-status-YYYY-MM-DD.md: Comprehensive markdown for AI agents

After rendering, present a brief summary in chat and point to the files:
🟡 CI is degraded — release/3.119.x is blocked by a Guardian TSA upload failure; main is green.
📊 AzDO Health:
main ✅ Public | ✅ Native | ✅ Managed | ✅ Tests
release/4.147.0-preview.3 ❌ Public | ⚠️ Native | ✅ Managed | ✅ Tests
release/3.119.x ❌ Public | ⚠️ Native | ⚠️ Managed | ❌ Tests
🔗 Chain verdict:
release/3.119.x: Tests red independently (Guardian TSA); Native/Managed warnings only — no cascade.
release/4.147.0-preview.3: Public CI red (CS0016 errors); internal chain unaffected.
🐙 GitHub Actions:
🟠 High: Docs - Deploy ✅ | Publish Samples ✅ | Auto API Docs Writer ❌
🟡 Medium: Backport ✅ | Go Live ✅ | Auto-Triage ✅
⚪ Low: All passing
Top actions:
1. [release/3.119.x] Fix Guardian TSA upload — blocks Tests → build 14177772
2. [release/4.147.0-preview.3] Investigate CS0016 errors → build 157985
3. [GitHub Actions] Auto API Docs Writer failing → run 26651087356
📁 Reports:
JSON: output/ai/ci-status-2026-05-29.json
HTML: output/ai/ci-status-2026-05-29.html
MD: output/ai/ci-status-2026-05-29.md
If asked to dig deeper:
- Use hlx-azdo_* tools to fetch specific build timelines, test results, or logs
- Use hlx-azdo_build_analysis for known-issue matching
- Use hlx-azdo_test_results to get specific failing test names

| Question | Use |
|---|---|
| "Is main green?" | ci-status |
| "How's the release/3.119.4 build doing?" | release-status |
| "Daily CI check" | ci-status |
| "Are packages ready for release X?" | release-status |
| "Any CI failures across the board?" | ci-status |
| "What automation is failing?" | ci-status |
| "Trace the pipeline chain for branch X" | release-status |
| "Why is CI red?" | ci-status (with analysis) |
| "Is release/X shippable?" | ci-status (risk assessment) |
| "GitHub Actions status?" | ci-status |
This skill works well as a daily scheduled workflow:
Prompt: "Run the ci-status skill and report the health of main and recent release branches. Generate a full report with AI analysis."
Schedule: Daily at 9:00 AM