Upgrading Ceph from Reef to Tentacle in a Rook-Managed Cluster
A real-world walkthrough of upgrading Ceph from v18 (Reef) through v19 (Squid) to v20 (Tentacle) via GitOps—including the correction of my wrong assumptions about Rook version constraints.
Posted 2025·12·20 · 9 min read · Category: storage · 1,769 words
“You can’t skip Ceph versions. But you can be wrong about Rook constraints.”
The Situation
Reef (v18) end-of-life is August 2025. Tentacle (v20) shipped in November 2025. Time to upgrade.
I initially wrote a combined “Reef to Tentacle” upgrade guide, planning to do both hops in sequence. Then I read the Rook compatibility matrix and concluded I needed to wait for Rook v1.19 to get Tentacle support.
I was wrong.
The Rook Constraint (Corrected)
Here’s what I originally thought:
| Rook Version | Supported Ceph Versions |
| --- | --- |
| v1.17.x | Reef only |
| v1.18.x | Reef + Squid |
| v1.19+ | Squid + Tentacle (drops Reef) |
The reality: Rook v1.18.8 added Tentacle support. You don’t need to wait for v1.19.
“If you’re running Ceph within Kubernetes using Rook Ceph, and you want to use Tentacle without the unsupported flag, you need to update at least to version v1.18.8.”
So the actual support matrix is:
| Rook Version | Supported Ceph Versions |
| --- | --- |
| v1.18.0 - v1.18.7 | Reef + Squid |
| v1.18.8+ | Reef + Squid + Tentacle |
The upgrade path still requires going through Squid—you can’t skip versions—but you can do both hops on Rook v1.18.8.
My Starting Point
$ kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph version
ceph version 18.2.7 (2cf3b0098dc3cbb1b6f2e8d8ed9df8c65b6aee53) reef (stable)

$ kubectl -n rook-ceph get deploy rook-ceph-operator -o jsonpath='{.spec.template.spec.containers[0].image}'
ghcr.io/rook/ceph:v1.18.8
Reef v18.2.7 on Rook v1.18.8. Good to go for the full journey.
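If you want to script that pre-flight check rather than eyeball it, here is a minimal sketch. It assumes GNU `sort -V` is available, and the names `MIN_ROOK`, `rook_operator_tag` and `version_at_least` are my own illustrations, not anything Rook ships.

```shell
# Sketch: confirm the Rook operator is new enough for Tentacle.
# MIN_ROOK and both function names are illustrative, not part of Rook.
MIN_ROOK="v1.18.8"

# Print the operator image tag, e.g. "v1.18.8" from "ghcr.io/rook/ceph:v1.18.8".
rook_operator_tag() {
  kubectl -n rook-ceph get deploy rook-ceph-operator \
    -o jsonpath='{.spec.template.spec.containers[0].image}' | sed 's/.*://'
}

# Succeed if version $1 is at least version $2 (GNU sort version ordering).
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage:
#   version_at_least "$(rook_operator_tag)" "$MIN_ROOK" || echo "upgrade Rook first"
```

The version ordering matters here: a plain string comparison would rank v1.18.10 below v1.18.8.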
Phase 1: Reef to Squid
Step 1: Back Up Everything
Ceph upgrades are one-way. Once you run ceph osd require-osd-release squid, there’s no going back to Reef.
Without the noout and norebalance flags, Ceph gets nervous when daemons restart and starts shuffling data around. That slows down the upgrade and adds risk.
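Going by the commands that appear later in this post, the flags in question are noout and norebalance. A minimal sketch of setting them before the upgrade and clearing them afterwards could look like this. `ceph_tools`, `set_upgrade_flags` and `clear_upgrade_flags` are hypothetical helper names that just wrap the toolbox invocation used throughout.

```shell
# Sketch: set the maintenance flags before the upgrade, clear them after.
# ceph_tools is a hypothetical wrapper around the toolbox deployment.
ceph_tools() {
  kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph "$@"
}

# Stop Ceph from marking restarting OSDs "out" and from rebalancing data.
set_upgrade_flags() {
  ceph_tools osd set noout && ceph_tools osd set norebalance
}

# Restore normal behaviour once every daemon is back and healthy.
clear_upgrade_flags() {
  ceph_tools osd unset norebalance && ceph_tools osd unset noout
}
```

Keep in mind that the cluster reports HEALTH_WARN for as long as these flags are set, so clearing them is part of finishing the upgrade.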
Step 3: Update the HelmRelease
The actual change is one line:
# kubernetes/apps/rook-ceph/rook-ceph/cluster/helmrelease.yaml
cephClusterSpec:
  cephVersion:
    image: quay.io/ceph/ceph:v19.2.3-20250717 # Was v18.2.7
    allowUnsupported: false
Note the build-specific tag (v19.2.3-20250717). Don’t use just v19.2.3—the build suffix ensures you get a specific, tested image rather than whatever “latest v19.2.3” happens to be.
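If you want a guard against accidentally committing a floating tag, a small check like the following could run in a pre-commit hook or CI job. This is a sketch: `has_build_suffix` is a name I made up, and the pattern assumes the vX.Y.Z-YYYYMMDD shape of the tags shown in this post.

```shell
# Succeed if the image reference ends in a build-specific tag (vX.Y.Z-YYYYMMDD).
# has_build_suffix is an illustrative helper, not part of any tool.
has_build_suffix() {
  # ${1##*:} strips everything up to the last colon, leaving only the tag.
  printf '%s\n' "${1##*:}" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+-[0-9]{8}$'
}

# Usage:
#   has_build_suffix "quay.io/ceph/ceph:v19.2.3-20250717" || echo "pin a build tag"
```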
Step 4: Deploy via GitOps
git add kubernetes/apps/rook-ceph/rook-ceph/cluster/helmrelease.yaml
git commit -m "feat(rook-ceph): upgrade Ceph from Reef v18.2.7 to Squid v19.2.3"
git push
Flux picks up the change and triggers Rook to start the rolling upgrade.
Step 5: Watch the Rolling Upgrade
Rook upgrades daemons in a specific order: MON → MGR → MDS → OSD → RGW.
During the upgrade, you’ll see brief HEALTH_WARN states as daemons restart. This is normal. Only worry if you see HEALTH_ERR persisting for more than 10 minutes.
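The "HEALTH_ERR for more than 10 minutes" rule can be turned into a small watcher. The sketch below polls `ceph health` and fails only if HEALTH_ERR shows up on every poll for the whole limit. The function name, arguments and defaults are my own, and the timing is approximate because it counts poll intervals rather than wall-clock time.

```shell
# Sketch: poll `ceph health` and fail if HEALTH_ERR persists past a limit.
# ceph_tools and watch_health are illustrative helpers.
ceph_tools() {
  kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph "$@"
}

# watch_health [LIMIT_SECONDS] [INTERVAL_SECONDS] [DURATION_SECONDS]
# Returns 1 once HEALTH_ERR has been seen on consecutive polls covering
# LIMIT_SECONDS; returns 0 if DURATION_SECONDS pass without that happening.
watch_health() {
  local limit="${1:-600}" interval="${2:-15}" duration="${3:-3600}"
  local err_for=0 elapsed=0 status
  while [ "$elapsed" -lt "$duration" ]; do
    status="$(ceph_tools health 2>/dev/null || true)"
    case "$status" in
      HEALTH_ERR*) err_for=$((err_for + interval)) ;;
      *)           err_for=0 ;;  # HEALTH_OK, HEALTH_WARN, or a failed exec
    esac
    if [ "$err_for" -ge "$limit" ]; then
      echo "HEALTH_ERR has persisted for ${err_for}s" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 0
}

# Usage: watch_health 600 15 1800
```

Resetting the counter on anything other than HEALTH_ERR is deliberate: brief HEALTH_WARN blips during daemon restarts should not add up to an alarm.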
My 3-node cluster upgraded in about 3 minutes:
{
  "mon": {"ceph version 19.2.3 (...) squid (stable)": 3},
  "mgr": {"ceph version 19.2.3 (...) squid (stable)": 2},
  "osd": {"ceph version 19.2.3 (...) squid (stable)": 3},
  "mds": {"ceph version 19.2.3 (...) squid (stable)": 2},
  "rgw": {"ceph version 19.2.3 (...) squid (stable)": 2}
}
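A quick way to confirm that nothing was left behind is to count the distinct version numbers in that output. A rough sketch, using grep instead of a JSON parser so it has no extra dependencies (`distinct_ceph_versions` is a made-up name, and it compares version numbers only, not build hashes):

```shell
# Print how many distinct Ceph version numbers appear in `ceph versions`
# output read from stdin. A result of 1 means every daemon is on the same release.
distinct_ceph_versions() {
  grep -oE 'ceph version [0-9]+\.[0-9]+\.[0-9]+' | sort -u | wc -l | tr -d ' '
}

# Usage:
#   kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph versions | distinct_ceph_versions
```

Anything greater than 1 means at least one daemon is still running the old release and the rolling upgrade has not finished.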
Step 6: Finalize the Squid Upgrade
This step is critical. Running ceph osd require-osd-release squid tells Ceph that all OSDs are now Squid-capable, enabling Squid-specific features.
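As a sketch of that finalization through the toolbox: the `require-osd-release` command is the one named back in Step 1, and the `ceph osd dump` check afterwards is my own addition for confirming the value took effect. `ceph_tools` and `finalize_release` are hypothetical helper names.

```shell
# Sketch: raise the minimum OSD release, then show what the OSD map reports.
# ceph_tools and finalize_release are illustrative helpers.
ceph_tools() {
  kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph "$@"
}

# finalize_release RELEASE_NAME   (e.g. squid)
# This is the one-way step: only run it once every OSD is on the new release.
finalize_release() {
  local release="$1"
  ceph_tools osd require-osd-release "$release" &&
    ceph_tools osd dump | grep require_osd_release
}

# Usage: finalize_release squid
```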
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd set noout
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd set norebalance
Step 3: Update HelmRelease to Tentacle
# kubernetes/apps/rook-ceph/rook-ceph/cluster/helmrelease.yaml
cephClusterSpec:
  cephVersion:
    image: quay.io/ceph/ceph:v20.2.0-20251104 # Was v19.2.3-20250717
    allowUnsupported: false
Step 4: Deploy via GitOps
git add kubernetes/apps/rook-ceph/rook-ceph/cluster/helmrelease.yaml
git commit -m "feat(rook-ceph): upgrade Ceph from Squid v19.2.3 to Tentacle v20.2.0"
git push