.agents/skills/prepare-providers-documentation/SKILL.md
This skill replaces the manual commit-by-commit classification step that the
release manager normally performs when running
breeze release-management prepare-provider-documentation. Instead of asking
the release manager to type d/b/f/x/m/s/v for each commit, the
skill drives the classification itself — inspecting every PR (with extra care
for potentially breaking changes), scoping multi-provider PRs to the slice
that touched the current provider, and asking the release manager only when
genuinely uncertain.
The skill keeps the existing breeze tooling as the source of truth for
template generation. Claude only owns the classification + version bump +
changelog entries; everything else (__init__.py, README.rst,
pyproject.toml, conf.py, get_provider_info.py, index.rst) is still
regenerated by breeze release-management prepare-provider-documentation --reapply-templates-only.
[!IMPORTANT] This is a release-manager workflow. It mutates
`provider.yaml` and `changelog.rst` for many providers in one pass. Always run on a clean working tree (or in a dedicated branch) and let the release manager review the diff before committing.
Use during the regular provider release cycle, in place of either of:
breeze release-management prepare-provider-documentation
breeze release-management prepare-provider-documentation --incremental-update
…when the release manager wants Claude to classify the changes instead of doing it by hand. The skill covers the same scope: classifying changes, bumping versions, generating changelog sections, reapplying templates, and folding new commits into an already-prepared release PR (incremental update).
Two entry points:
main since the changelog was first generated (typical when
rebasing a release PR before merging). Skip ahead to the
Incremental Update section after Phase 5.
Do not use this skill for:
* `--only-min-version-update` runs (these don't need classification; just
run breeze directly).
* main base branch unless you also pass the right
`--base-branch` to the breeze invocations described below.
Ask the release manager (and confirm by reading the answers back) for:
* `RELEASE_DATE` in `YYYY-MM-DD` (or `YYYY-MM-DD_NN`) format, e.g.
`2026-04-26`. This is what breeze stamps into
`providers/.last_release_date.txt`.
* main. Only override when releasing from a
provider-specific branch (e.g. `provider-cncf-kubernetes/v4-4`).
* DISTRIBUTIONS_LIST), use that list.
* `--include-not-ready-providers`
and/or `--include-removed-providers`.
Set the environment for the session:
export RELEASE_DATE=<date>
# Optional, scopes everything to a subset
export DISTRIBUTIONS_LIST="<provider1> <provider2> ..."
Make sure the apache-https-for-providers git remote exists and is up to
date — running breeze the first time below will recreate and fetch it.
The skill runs in five phases. Mark tasks with TaskCreate for each phase
and tick them off as you go — the release manager wants to see progress.
For each provider, the source of truth for "what changed since last release"
is the same git query breeze uses internally: commits between the latest
release tag for that provider (providers-<id>/<version>) and
apache-https-for-providers/<base-branch>, restricted to the provider's own
folders.
Discover in batch by running:
breeze release-management prepare-provider-documentation \
--non-interactive \
--skip-changelog \
--skip-readme \
--release-date "$RELEASE_DATE"
[!WARNING] Do not commit the result of that command.
`--non-interactive` answers the classification prompts with random values, so Claude will overwrite the changelog and version bumps in Phase 4 with real classifications. The only reason to run breeze first is to refresh the apache remote, regenerate build files, and confirm which providers have pending changes (read the "Summary of prepared documentation" block at the end).
Record from the summary:
Reset the per-provider files that breeze touched but you'll be rewriting yourself before continuing:
git checkout -- $(git diff --name-only -- '**/provider.yaml' '**/changelog.rst')
This leaves the regenerated build files (__init__.py, README.rst,
pyproject.toml, conf.py, get_provider_info.py, index.rst) in place
and discards only the stuff Claude is about to rewrite.
For each provider in Success from Phase 1, get the same commit list that breeze would have shown. From the repo root:
PROVIDER_ID=<dotted.id> # e.g. amazon, cncf.kubernetes
PROVIDER_PATH=$(echo "$PROVIDER_ID" | tr '.' '/') # folder path: cncf/kubernetes
PROVIDER_TAG=$(echo "$PROVIDER_ID" | tr '.' '-') # tag segment: cncf-kubernetes
# Pick the latest *final* release tag. Two gotchas the tag pattern must handle:
# * dotted provider ids use HYPHENS in tag names (providers-cncf-kubernetes/<ver>),
# even though the source folder uses slashes — build the tag prefix from
# PROVIDER_TAG, not PROVIDER_PATH;
# * skip the sentinel upper-bound tags (providers-<id>/99.98.0, /99.99.0) and rc
# tags — git's default version sort orders "1.2.0rc1" AFTER "1.2.0", so a bare
# `head -n1` would otherwise select a sentinel or a release candidate.
LAST_TAG=$(git tag --list "providers-${PROVIDER_TAG}/*" --sort=-v:refname \
| grep -vE '/99\.9[0-9]\.' | grep -vE 'rc[0-9]+$' | head -n1)
git log --pretty=format:'%H %h %cd %s' --date=short \
"${LAST_TAG}..apache-https-for-providers/main" \
-- "providers/${PROVIDER_PATH}/"
[!WARNING] This git query is a convenience for building the per-provider commit list, but the authoritative set is what breeze prints in the Phase 1 "Commit" tables for each provider. The tag-based range can still diverge from breeze when a provider's most recent final tag is not the last actually-published release (for example, a wave commit bumped the version on
`main` but the published baseline is older), which makes breeze include repo-wide commits this query misses. When the two disagree, trust breeze's list and reconcile against it before classifying.
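The tag-selection rules used for `LAST_TAG` (hyphenated tag segment, no sentinel tags, no release candidates) can also be sketched in Python. This is only an illustration: `latest_final_tag` is a hypothetical helper, not part of breeze, and it assumes final tags always have a plain `X.Y.Z` version, which is slightly stricter than the `grep` filters above.

```python
import re


def latest_final_tag(tags, provider_tag):
    """Return the highest final-release tag for a provider, or None."""
    prefix = f"providers-{provider_tag}/"
    candidates = []
    for tag in tags:
        if not tag.startswith(prefix):
            continue
        version = tag[len(prefix):]
        # Assumption: a final release is exactly X.Y.Z, which drops rc tags.
        match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
        if not match:
            continue
        parts = tuple(int(p) for p in match.groups())
        # Skip sentinel upper-bound tags such as 99.98.0 and 99.99.0.
        if parts[0] == 99 and 90 <= parts[1] <= 99:
            continue
        candidates.append((parts, tag))
    # Compare numerically so that 10.1.0 sorts above 9.9.0.
    return max(candidates)[1] if candidates else None
```

Feeding it the output of `git tag --list` (one tag per line, split into a list) should pick the same tag as the shell pipeline whenever all final tags follow the `X.Y.Z` shape.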
Capture the full hash, short hash, date, subject, and #NNNN PR number for
each commit. Note that some old providers also have legacy paths under
airflow/providers/<id>/ — include those when present (consult
provider_details.possible_old_provider_paths semantics by checking the
provider's provider.yaml history if needed).
For each commit, classify it into one of:
| Code | Meaning | Version bump |
|---|---|---|
| `d` | Documentation-only | none (patch if combined) |
| `b` | Bug fix | patch |
| `f` | Feature | minor |
| `x` | Breaking change | major |
| `m` | Misc (deps, refactors, internal only) | patch |
| `s` | Skip (test/CI/example only, no user impact) | none |
| `v` | Min Airflow version bump | minor (treated as misc + bump) |
Before spawning a sub-agent, apply the same fast heuristics breeze uses
(see classify_provider_pr_files in
dev/breeze/src/airflow_breeze/prepare_providers/provider_documentation.py):
* `providers/<id>/docs/**/*.rst` → `d` (docs).
* `providers/<id>/tests/**` or
`providers/<id>/src/airflow/providers/<id>/example_dags/**` → `s` (skip).
* `Bump minimum Airflow version` and only `__init__.py` /
`provider.yaml` changed → `v`.
Note these classifications and move on; no sub-agent needed.
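A minimal sketch of these fast heuristics, for illustration only. `auto_classify` is a hypothetical helper and not the breeze implementation; it assumes each rule applies only when every changed file in the provider's slice matches, and that the `v` rule is triggered by the commit subject.

```python
def auto_classify(provider_path, subject, changed_files):
    """Return 'd', 's', 'v', or None when a sub-agent should decide."""
    root = f"providers/{provider_path}/"
    # Only the slice of the commit that touched this provider matters.
    rel = [f[len(root):] for f in changed_files if f.startswith(root)]
    if not rel:
        return None
    # Docs-only: every file is an .rst file under docs/.
    if all(r.startswith("docs/") and r.endswith(".rst") for r in rel):
        return "d"
    # Tests or example DAGs only: nothing user-facing changed.
    if all(r.startswith("tests/") or "/example_dags/" in r for r in rel):
        return "s"
    # Min Airflow bump: matching subject, only __init__.py / provider.yaml.
    if "Bump minimum Airflow version" in subject and all(
        r.endswith("/__init__.py") or r == "provider.yaml" for r in rel
    ):
        return "v"
    return None
```

A `None` result means the commit goes to the sub-agent queue instead of being classified on the spot.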
For the remaining commits, spawn sub-agents in parallel (batches of 5–10 to
avoid context pressure). Use the Explore agent type — they need read-only
access. Brief each sub-agent with:
Classify a single Apache Airflow provider PR.
PR: #<NNNN>
Commit: <full-hash>
Subject: <subject>
Provider: <provider-id> (path: providers/<provider-path>/)
Tasks:
1. Read the PR's title, body, and labels:
`gh pr view <NNNN> --json title,body,labels,files`
2. Read the diff for the slice of the PR that touched
providers/<provider-path>/ only:
`gh pr diff <NNNN> -- 'providers/<provider-path>/**'`
(When the PR touches multiple providers, you only care about the slice
for THIS provider — ignore the others when classifying.)
3. Decide a single classification:
- documentation: only docs/comments/typos in the provider slice
- bugfix: fixes incorrect behavior, no API changes
- feature: adds new capability, parameter, operator, sensor, hook,
or extends an existing one in a backwards-compatible way
- breaking: see "Breaking-change checklist" below
- misc: dependency bumps, internal refactors, packaging-only
changes, type-hint cleanups, no user-visible behavior
- skip: only tests/examples/CI for this provider's slice
- min_airflow_bump: explicitly bumps the minimum Airflow version pin
4. Output strictly:
CLASSIFICATION: <one of: documentation|bugfix|feature|breaking|misc|skip|min_airflow_bump>
CONFIDENCE: <high|medium|low>
JUSTIFICATION: <one sentence>
BREAKING_RISK: <none|maybe|yes> (set "maybe" when the diff has any
signal from the breaking-change
checklist, even if you think the
author intended otherwise)
Breaking-change checklist (any of these → BREAKING_RISK >= maybe; usually
breaking unless clearly behind a deprecation shim):
* Public class/function/method removed or renamed
in the **public interface** of the provider — i.e. files under
`providers/<path>/src/**/{hooks,operators,sensors,triggers,
notifications,decorators,executors}/**`, the provider's
top-level package `__init__.py`, plus anything imported by
`provider.yaml` (`hook-class-names`, `extra-links`, etc.).
Internal helpers (e.g. `utils/`, `_internal/`, `pod_manager.py`,
or any module not re-exported from the package or referenced
in `provider.yaml`) are NOT breaking on their own. NOT in tests/.
* Required parameter added to a public constructor or operator __init__
* Default value of a public parameter changed
* Return type or signature of a public method changed
* `extra_dejson` / connection-form fields removed or renamed
* Behavior change in `execute()`, `poke()`, `get_conn()` that produces
different results for the same inputs
* Minimum Python or Airflow version bumped (separate: that's
min_airflow_bump unless the bump excludes a previously supported version
of a provider's hard dependency, in which case it's also breaking)
* Removed deprecation: a previously-deprecated symbol is now deleted
* Schema change in stored data (xcom, connection, asset metadata,
or the serialized state/context of a `BaseTrigger` subclass —
deferred tasks survive provider upgrades only if the trigger's
`serialize()` payload stays compatible)
Do NOT trust the PR title alone — read the diff. A PR titled "Refactor X"
that removes a public method is breaking. A PR titled "BREAKING: rename
foo" that only renames a private symbol is not.
Collect all sub-agent results into a table.
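Since the brief asks for a strict four-field answer, the replies can be validated before they enter the table. The following is a sketch under that assumption; `parse_verdict` and `ALLOWED` are hypothetical names introduced here for illustration.

```python
import re

# Allowed values, copied from the output format in the sub-agent brief.
ALLOWED = {
    "CLASSIFICATION": {
        "documentation", "bugfix", "feature", "breaking",
        "misc", "skip", "min_airflow_bump",
    },
    "CONFIDENCE": {"high", "medium", "low"},
    "BREAKING_RISK": {"none", "maybe", "yes"},
}


def parse_verdict(text):
    """Parse a sub-agent reply into a dict; raise ValueError if malformed."""
    verdict = {}
    for line in text.splitlines():
        match = re.match(r"\s*([A-Z_]+):\s*(.*\S)", line)
        if not match:
            continue
        key, value = match.groups()
        if key in ALLOWED:
            # Keep only the first token so trailing remarks are ignored.
            value = value.split()[0].lower()
            if value not in ALLOWED[key]:
                raise ValueError(f"unexpected {key}: {value!r}")
            verdict[key] = value
        elif key == "JUSTIFICATION":
            verdict[key] = value
    missing = [k for k in (*ALLOWED, "JUSTIFICATION") if k not in verdict]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return verdict
```

Rejecting malformed replies early avoids silently recording a guess; a failed parse is a reasonable trigger to re-run that sub-agent or escalate.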
Print a per-provider summary in this exact format (so the release manager can scan it quickly):
Provider: amazon
Current version: 9.12.0
Most-impactful change: feature → next version: 9.13.0
Commits (12):
abc1234 d high docs: fix S3 example #65000
def5678 b high Fix retry on transient SQS error #65010
9ab0123 f high Add wait_for_completion to AthenaOperator #65020
4cd5678 x med Remove deprecated S3Hook.list_objects #65030 ⚠ BREAKING
7ef9012 m high Bump aiobotocore to 2.13 #65040
...
Uncertain: 2 commits below — please confirm:
4cd5678 x med Remove deprecated S3Hook.list_objects (#65030)
Why: list_objects is documented as deprecated since 8.0.0 but never
raised DeprecationWarning, so removal may surprise users.
abc4321 ? low "Refactor Athena client" (#65060)
Why: PR description says non-breaking but diff changes the default
region resolution from env to provider extras.
Always escalate to the release manager when:
* `CONFIDENCE: low` from any sub-agent.
* `BREAKING_RISK: maybe` but the sub-agent classified as anything other than
`breaking`.
* `breaking` (major bump): always reconfirm
explicitly before applying. Major bumps are never silent.
If the release manager corrects a classification, save it in your classification table and re-derive the most-impactful change.
For each provider, in order:
provider.yaml
Open `providers/<provider-path>/provider.yaml`, find the `versions:` block,
and prepend the new version. The bump rule (most-impactful classification
across all commits for this provider, computed in Phase 3.5):
| Most-impactful | Bump |
|---|---|
| `breaking` | major (X+1.0.0) |
| `feature` | minor (X.Y+1.0) |
| `min_airflow_bump` | minor (X.Y+1.0) |
| `bugfix` | patch (X.Y.Z+1) |
| `misc` | patch (X.Y.Z+1) |
| `documentation` only | no bump; handle as doc-only (see below) |
| `skip` only | no bump; nothing to do |
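As a worked illustration of the bump rule, the sketch below derives the next version from a provider's classifications. `next_version` and the two lookup tables are hypothetical names; the mapping itself is taken from the table above.

```python
# Bump kind per classification, as listed in the bump-rule table.
BUMP_KIND = {
    "breaking": "major",
    "feature": "minor",
    "min_airflow_bump": "minor",
    "bugfix": "patch",
    "misc": "patch",
    "documentation": "none",
    "skip": "none",
}
ORDER = {"none": 0, "patch": 1, "minor": 2, "major": 3}


def next_version(current, classifications):
    """Return the bumped X.Y.Z string, or None when no bump is needed."""
    kinds = [BUMP_KIND[c] for c in classifications]
    # The most-impactful classification decides the bump.
    strongest = max(kinds, key=ORDER.__getitem__, default="none")
    major, minor, patch = (int(p) for p in current.split("."))
    if strongest == "major":
        return f"{major + 1}.0.0"
    if strongest == "minor":
        return f"{major}.{minor + 1}.0"
    if strongest == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return None
```

For example, `next_version("9.12.0", ["documentation", "bugfix", "feature"])` yields `"9.13.0"`, while a provider with only documentation or skip commits yields `None` and falls through to the doc-only handling.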
Also update source-date-epoch: to the current int(time.time()).
For doc-only providers, do not bump the version. Instead, write the
latest commit hash from the doc-only batch into
providers/<provider-path>/docs/.latest-doc-only-change.txt (newline
terminated). This is what breeze checks on the next release to know the
provider hasn't really changed.
Open providers/<provider-path>/docs/changelog.rst. Insert a new section
above the most recent existing version section. The exact format must
match dev/breeze/src/airflow_breeze/templates/CHANGELOG_TEMPLATE.rst.jinja2
— don't paraphrase it. The skeleton:
<NEW_VERSION>
<dots matching length of NEW_VERSION>
.. note::
    This release of provider is only available for Airflow X.Y+ as explained in the
    `Apache Airflow providers support policy <https://github.com/apache/airflow/blob/main/PROVIDERS.rst#minimum-supported-version-of-airflow-for-community-managed-providers>`_.
Breaking changes
~~~~~~~~~~~~~~~~
* ``<commit subject for breaking change> (#NNNN)``
Features
~~~~~~~~
* ``<commit subject for feature> (#NNNN)``
Bug Fixes
~~~~~~~~~
* ``<commit subject for bugfix> (#NNNN)``
Misc
~~~~
* ``<commit subject for misc/min_airflow_bump> (#NNNN)``
Doc-only
~~~~~~~~
* ``<commit subject for doc> (#NNNN)``
.. Below changes are excluded from the changelog. Move them to
   appropriate section above if needed. Do not delete the lines(!):
   * ``<commit subject for skip> (#NNNN)``
Rules:
* `.. note::` block only when the version bump was driven by
a `min_airflow_bump` (or by a `breaking` whose breaking aspect is the
Airflow min bump).
* Breaking changes
section if there were none; don't leave an empty header).
* `.. Below changes are excluded ...` block at the end is required even
if empty. Lines under it use the indented ` * ``...``` form (three-space
indent, double backticks).
* message_without_backticks). Don't paraphrase.
* (#NNNN) PR suffix.
Once all providers have their `provider.yaml` and `changelog.rst`
updated, run:
breeze release-management prepare-provider-documentation \
--reapply-templates-only \
--skip-git-fetch \
--release-date "$RELEASE_DATE"
This regenerates __init__.py, README.rst, pyproject.toml, conf.py,
get_provider_info.py, and index.rst for every provider — picking up the
new versions you just wrote. It will not touch changelog.rst.
[!NOTE]
`commits.rst` per provider is also stable template content (the actual commit list is rendered at doc-build time via the `airflow-providers-commits` directive). It will be regenerated on the next full release. No action needed here.
Run the same checks the release manager would run:
# RST lint + license headers + ruff on Python files
prek run --from-ref main --hook-stage pre-commit
# Spot-check that provider.yaml versions parse
breeze release-management prepare-provider-documentation \
--reapply-templates-only --skip-git-fetch \
--release-date "$RELEASE_DATE" # idempotent — should be a no-op diff
Then git diff --stat and walk the release manager through the diff
provider-by-provider:
* `provider.yaml` matches the bump rule.
* `changelog.rst` has the right sections populated.
Stop here. Do not commit, do not push; the release manager opens the PR
themselves following the regular release workflow in
`dev/README_RELEASE_PROVIDERS.md`.
Use this flow when the release PR has already been opened (changelog and
version bumps applied via Phases 1–5) and the release manager rebases it
to pick up commits that landed on main after the original classification.
This is the equivalent of breeze release-management prepare-provider-documentation --incremental-update, but driven by the
same AI classification logic as the initial run.
[!IMPORTANT] Run on the release PR branch after rebasing onto the latest base branch. Do not start the incremental flow on a clean checkout — it needs the prior classifications already written into
`changelog.rst` to diff against.
breeze release-management prepare-provider-documentation \
--reapply-templates-only \
--release-date "$RELEASE_DATE"
This re-fetches apache-https-for-providers/<base-branch> and regenerates
the auto-generated build files for every provider — picking up any
upstream template changes that landed since the original PR was opened.
It does not touch provider.yaml or changelog.rst.
For each provider that already has a new version section in its
changelog.rst (the providers in the release PR), get the current commit
list the same way as Phase 2 of the initial run:
PROVIDER_ID=<dotted.id>
PROVIDER_PATH=$(echo "$PROVIDER_ID" | tr '.' '/') # folder path (slashes)
PROVIDER_TAG=$(echo "$PROVIDER_ID" | tr '.' '-') # tag segment (hyphens)
# Same tag-selection rules as Phase 2: hyphenated tag segment, and skip sentinel
# (99.98.0/99.99.0) and rc tags so we compare against the last *final* release.
LAST_TAG=$(git tag --list "providers-${PROVIDER_TAG}/*" --sort=-v:refname \
| grep -vE '/99\.9[0-9]\.' | grep -vE 'rc[0-9]+$' | head -n1)
git log --pretty=format:'%H %h %cd %s' --date=short \
"${LAST_TAG}..apache-https-for-providers/main" \
-- "providers/${PROVIDER_PATH}/"
Then identify new commits by comparing PR numbers to the existing
changelog. A commit is "new" if its (#NNNN) PR suffix is not already
present anywhere in providers/${PROVIDER_PATH}/docs/changelog.rst. This
is exactly the same predicate breeze uses internally (see
_generate_new_changelog append branch in
dev/breeze/src/airflow_breeze/prepare_providers/provider_documentation.py).
CHANGELOG="providers/${PROVIDER_PATH}/docs/changelog.rst"
# pseudo: emit only commits whose #NNNN is NOT in the changelog
git log --pretty=format:'%H %h %cd %s' --date=short \
"${LAST_TAG}..apache-https-for-providers/main" \
-- "providers/${PROVIDER_PATH}/" \
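  | python3 -c "
import re, sys
# Full changelog text; a PR counts as already handled if '(#NNNN)' appears in it.
seen = open('${CHANGELOG}').read()
for line in sys.stdin:
    m = re.search(r'\(#(\d+)\)', line)
    # Keep commits with no PR suffix, or whose PR is not in the changelog yet.
    if not m or f'(#{m.group(1)})' not in seen:
        print(line, end='')
"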
If there are zero new commits for a provider, skip it.
Same logic as Phase 3 of the initial run — including the auto-classify heuristic for docs/test-only changes and the sub-agent-per-PR pattern with the breaking-change checklist. The output is a per-provider table mapping each new commit hash to a classification.
Compute the most-impactful classification across both the existing
classified commits in the changelog and the new ones. If the most
impactful is now stronger than what's already in provider.yaml, the
version needs to be re-bumped. The escalation table:
| Was bumped to | Now most-impactful is | Action |
|---|---|---|
| patch | feature | re-bump to next minor (X.Y+1.0) |
| patch | min_airflow_bump | re-bump to next minor (X.Y+1.0) |
| patch / minor | breaking | re-bump to next major (X+1.0.0) |
| minor | feature | no change — already minor |
| anything | bugfix or misc | no change |
A re-bump means: replace the prepended version in provider.yaml AND
update the version header in changelog.rst's new section to match.
Always confirm a re-bump with the release manager — explicitly state the old version, the new version, and which incoming commit forced the escalation. Don't silently re-bump.
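The escalation table reduces to comparing the bump already applied with the strongest bump the new commits require. A sketch, with hypothetical names (`rebump_target`, `BUMP_KIND`, `ORDER`) and the classification-to-bump mapping taken from the initial run:

```python
ORDER = {"patch": 1, "minor": 2, "major": 3}
BUMP_KIND = {
    "breaking": "major",
    "feature": "minor",
    "min_airflow_bump": "minor",
    "bugfix": "patch",
    "misc": "patch",
}


def rebump_target(current_bump, new_classifications):
    """Return the stronger bump kind forced by new commits, or None."""
    # documentation and skip never force a bump, so they are ignored here.
    needed = [BUMP_KIND[c] for c in new_classifications if c in BUMP_KIND]
    strongest = max(needed, key=ORDER.__getitem__, default=None)
    if strongest and ORDER[strongest] > ORDER[current_bump]:
        return strongest
    return None
```

A non-`None` result is the cue to stop and confirm with the release manager before touching `provider.yaml` or the changelog header.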
For each new commit, insert into the existing latest-version section of
changelog.rst under the right header:
| Classification | Section |
|---|---|
| `breaking` | Breaking changes |
| `feature` | Features |
| `bugfix` | Bug Fixes |
| `misc` | Misc |
| `min_airflow_bump` | Misc |
| `documentation` | Doc-only |
| `skip` | excluded block at end |
If the section header doesn't exist yet (e.g. previously there were no
breaking changes, but a new commit introduced one), create the header
above the next existing section, matching the order in
CHANGELOG_TEMPLATE.rst.jinja2:
Breaking changes → Features → Bug Fixes → Misc → Doc-only.
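The section mapping and ordering can be sketched as a grouping step. `group_by_section` is a hypothetical helper; it only decides which header each commit belongs under and in what order headers appear, and leaves the exact RST rendering to the template.

```python
SECTION_FOR = {
    "breaking": "Breaking changes",
    "feature": "Features",
    "bugfix": "Bug Fixes",
    "misc": "Misc",
    "min_airflow_bump": "Misc",
    "documentation": "Doc-only",
}
SECTION_ORDER = ["Breaking changes", "Features", "Bug Fixes", "Misc", "Doc-only"]


def group_by_section(commits):
    """Group (subject, classification) pairs into ordered sections.

    Returns (sections, excluded): sections is a list of (header, subjects)
    in template order with empty headers dropped; excluded holds the
    subjects of skipped commits for the block at the end.
    """
    sections = {name: [] for name in SECTION_ORDER}
    excluded = []
    for subject, classification in commits:
        if classification == "skip":
            excluded.append(subject)
        else:
            sections[SECTION_FOR[classification]].append(subject)
    # Drop empty sections so no bare header is written.
    ordered = [(name, items) for name, items in sections.items() if items]
    return ordered, excluded
```

Because the dictionary is built in template order, a newly needed header such as "Breaking changes" automatically lands above the sections that already exist.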
If you re-bumped the version in Incremental Phase 3.5, also add or remove the
.. note:: block about the Airflow min version requirement to match the
new bump kind.
Same as Phase 5 of the initial run plus an extra check: confirm there are
no leftover "Please review …" markers from a prior interactive
breeze release-management prepare-provider-documentation --incremental-update run. If any are present (someone ran the breeze
incremental flow before invoking this skill), remove them as part of the
final pass. Then walk the diff with the release manager.
When a single PR touches several providers (e.g.
Add Python 3.14 Support (#63520) touches dozens), classify it
independently per provider. The same PR can be feature in one provider
(a real new capability) and misc in another (just a constraint bump in
pyproject.toml). Always scope the sub-agent's diff inspection to the
current provider's path:
gh pr diff <NNNN> -- 'providers/<provider-path>/**'
If the per-provider classifications come back different, do NOT try to "reconcile" them — that's a feature, not a bug. The release manager wants each provider's changelog to reflect what changed in that provider.
When you ask, state your best guess and the alternative explicitly:
Provider `amazon`, commit `4cd5678` ("Remove deprecated `S3Hook.list_objects`" #65030): I classified this as breaking because the symbol is removed from the public API in `providers/amazon/src/airflow/providers/amazon/aws/hooks/s3.py`, even though the PR description says "deprecated since 8.0.0". Confirm breaking (major bump 9.x → 10.0.0) or override to misc (patch)?
Don't ask vague yes/no questions ("is this breaking?"); always offer the two alternatives with the version-bump consequence.
* ? and ask.
* `commits.rst`, `index.rst`, `__init__.py`, `README.rst`,
`pyproject.toml`, `conf.py`, `get_provider_info.py` directly. Those are
template-generated by breeze.
* `git add` or `git commit`; the release manager owns the PR.
If the per-provider commit count is huge (50+) and the sub-agents come
back with low confidence on most of them (typically because the diffs
require deep domain knowledge), tell the release manager you're stopping
the AI classification and recommend they run the regular interactive
breeze release-management prepare-provider-documentation for that
specific provider. Don't try to power through guesswork — the wrong
classification at major-bump granularity is worse than a slower manual run.
* `dev/breeze/src/airflow_breeze/prepare_providers/provider_documentation.py`:
the breeze module this skill replaces (classification + changelog
generation). Read this when in doubt about format.
* `dev/breeze/src/airflow_breeze/templates/CHANGELOG_TEMPLATE.rst.jinja2`:
exact format for the changelog section you write in Phase 4b.
* `dev/README_RELEASE_PROVIDERS.md` §"Convert commits to changelog entries
and bump provider versions": the human workflow this skill automates.
* `PROVIDERS.rst` §"Upgrading minimum supported version of Airflow":
policy for `min_airflow_bump` classifications.