This guide will walk through the steps to upgrade the version of Ceph in a Rook cluster. Rook and Ceph upgrades are designed to ensure data remains available even while the upgrade is proceeding. Rook will perform the upgrades in a rolling fashion such that application pods are not disrupted.
Rook is cautious when performing upgrades. When an upgrade is requested (the Ceph image has been updated in the CR), Rook goes through all the daemons one by one and performs checks on each of them. It makes sure a particular daemon can be stopped before performing the upgrade. Once the deployment has been updated, it checks whether it is ok to continue. After each daemon is updated, Rook waits for things to settle (monitors to be in quorum, PGs to be clean for OSDs, MDSes to be up, etc.); only when that condition is met does it move on to the next daemon. This process repeats until all the daemons have been updated.
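The per-daemon loop described above can be sketched in shell pseudologic. This is illustrative only; the daemon names and echoed steps are placeholders, not Rook's actual implementation:

```shell
# Illustrative sketch of the rolling-upgrade loop; daemon names are hypothetical.
daemons="mon-a mon-b mon-c osd-0 osd-1"
for d in $daemons; do
    echo "checking that $d can be safely stopped"      # e.g. quorum would survive, PGs stay available
    echo "updating the deployment for $d"
    echo "waiting for the cluster to settle after $d"  # mons in quorum, PGs clean, MDSes up
done
echo "all daemons updated"
```

Because each daemon must settle before the next one is touched, application pods keep access to their data throughout the upgrade.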
The following Ceph versions are supported in this release of Rook:
!!! important
    There is a known corruption issue with Ceph v20.2.0 when read affinity is enabled
    in the Ceph CSI driver (`csi.readAffinity.enabled: true`).
    If read affinity is enabled in your cluster, upgrade to v20.2.1+.
    If read affinity is enabled with v20.2.0, Rook automatically disables read affinity
    internally to help avoid this corruption. After the upgrade, existing applications
    must be restarted to fully disable read affinity for existing volumes.
    See this issue for more details.

!!! important
    When an update is requested, the operator will check Ceph's status;
    if it is in `HEALTH_ERR`, the operator will refuse to proceed with the upgrade.
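The same health gate can be checked by hand before changing the image. A minimal sketch, where the hard-coded sample JSON stands in for real `ceph status --format json` output from the toolbox pod:

```shell
# Sketch: gate an upgrade on Ceph health. The JSON here is a hard-coded sample;
# in a real cluster it would come from:
#   kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph status --format json
status='{"health":{"status":"HEALTH_OK"}}'
health=$(printf '%s' "$status" | sed -n 's/.*"status":"\([A-Z_]*\)".*/\1/p')
if [ "$health" = "HEALTH_ERR" ]; then
    echo "cluster is in HEALTH_ERR; do not upgrade"
else
    echo "health is $health; ok to proceed"
fi
```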
Official Ceph container images can be found on Quay.

These images are tagged in a few ways:

* Tags that include a specific build date (e.g., `v19.2.3-20250717`). These tags are recommended for production clusters, as there is no possibility for the cluster to be heterogeneous with respect to the version of Ceph running in containers.
* Ceph major version tags (e.g., `v19`) are useful for development and test clusters so that the latest version of Ceph is always available.

Ceph containers other than the official images from the registry above will not be supported.
The upgrade will be automated by the Rook operator after the desired Ceph image is changed in the
`CephCluster` CRD (`spec.cephVersion.image`).
```console
ROOK_CLUSTER_NAMESPACE=rook-ceph
NEW_CEPH_IMAGE='quay.io/ceph/ceph:v19.2.3-20250717'
kubectl -n $ROOK_CLUSTER_NAMESPACE patch CephCluster $ROOK_CLUSTER_NAMESPACE --type=merge -p "{\"spec\": {\"cephVersion\": {\"image\": \"$NEW_CEPH_IMAGE\"}}}"
```
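The merge-patch payload can be inspected locally before it is applied to the cluster; a small sketch:

```shell
# Build and print the merge patch used above, without touching the cluster.
NEW_CEPH_IMAGE='quay.io/ceph/ceph:v19.2.3-20250717'
PATCH="{\"spec\": {\"cephVersion\": {\"image\": \"$NEW_CEPH_IMAGE\"}}}"
echo "$PATCH"
```

A merge patch only needs the fields being changed, so the rest of the `CephCluster` spec is left untouched.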
Since the Rook toolbox is not controlled by the Rook operator, users must upgrade it manually by
modifying the image to match the Ceph version employed by the new Rook operator release. Employing
an outdated Ceph version within the toolbox may result in unexpected behaviour.
```console
kubectl -n rook-ceph set image deploy/rook-ceph-tools rook-ceph-tools=quay.io/ceph/ceph:v19.2.3-20250717
```
As with upgrading Rook, now wait for the upgrade to complete. Status can be determined in a similar way to the Rook upgrade as well.
```console
watch --exec kubectl -n $ROOK_CLUSTER_NAMESPACE get deployments -l rook_cluster=$ROOK_CLUSTER_NAMESPACE -o jsonpath='{range .items[*]}{.metadata.name}{" \treq/upd/avl: "}{.spec.replicas}{"/"}{.status.updatedReplicas}{"/"}{.status.readyReplicas}{" \tceph-version="}{.metadata.labels.ceph-version}{"\n"}{end}'
```
Confirm the upgrade is completed when the versions are all on the desired Ceph version.
```console
kubectl -n $ROOK_CLUSTER_NAMESPACE get deployment -l rook_cluster=$ROOK_CLUSTER_NAMESPACE -o jsonpath='{range .items[*]}{"ceph-version="}{.metadata.labels.ceph-version}{"\n"}{end}' | sort | uniq
```

This cluster is not yet finished:

```
ceph-version=v18.2.4-0
ceph-version=v19.2.3-0
```

This cluster is finished:

```
ceph-version=v19.2.3-0
```
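The "all on one version" check can also be scripted. A hedged sketch, using a hard-coded sample in place of the kubectl output above:

```shell
# Count distinct ceph-version labels; exactly one means the upgrade is complete.
# The sample below stands in for the `kubectl ... | sort | uniq` output.
versions='ceph-version=v18.2.4-0
ceph-version=v19.2.3-0'
count=$(printf '%s\n' "$versions" | sort -u | wc -l)
if [ "$count" -eq 1 ]; then
    echo "upgrade complete"
else
    echo "upgrade still in progress"
fi
```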
Verify the Ceph cluster's health using the health verification section.