I tried upgrading my home cluster from 15.2.5 to 15.2.7 today, and the upgrade appears to
be stuck in a loop while trying to match Docker image IDs for ceph:v15.2.7:
2020-12-01T16:47:26.761950-0700 mgr.aladdin.liknom [INF] Upgrade: Checking mgr daemons...
2020-12-01T16:47:26.769581-0700 mgr.aladdin.liknom [INF] Upgrade: All mgr daemons are up to date.
2020-12-01T16:47:26.770096-0700 mgr.aladdin.liknom [INF] Upgrade: Checking mon daemons...
2020-12-01T16:47:28.800426-0700 mgr.aladdin.liknom [INF] Upgrade: All mon daemons are up to date.
2020-12-01T16:47:28.800878-0700 mgr.aladdin.liknom [INF] Upgrade: Checking crash daemons...
2020-12-01T16:47:28.851819-0700 mgr.aladdin.liknom [INF] Upgrade: Setting container_image for all crash...
2020-12-01T16:47:28.855595-0700 mgr.aladdin.liknom [INF] Upgrade: All crash daemons are up to date.
2020-12-01T16:47:28.856283-0700 mgr.aladdin.liknom [INF] Upgrade: Checking osd daemons...
2020-12-01T16:47:31.348345-0700 mgr.aladdin.liknom [INF] Upgrade: Pulling docker.io/ceph/ceph:v15.2.7 on mandalaybay
2020-12-01T16:47:35.311065-0700 mgr.aladdin.liknom [INF] Upgrade: image docker.io/ceph/ceph:v15.2.7 pull on mandalaybay got new image 9a0677fecc08d155a8e643b37c6e97d45c04747d9cb9455cafe0a7590d00b959 (not 2bc420ddb175bd1cf9031387948a8812d1bda9ef1180e429b4704e3c06bb943e), restarting
2020-12-01T16:47:35.534893-0700 mgr.aladdin.liknom [INF] Upgrade: Target is docker.io/ceph/ceph:v15.2.7 with id 9a0677fecc08d155a8e643b37c6e97d45c04747d9cb9455cafe0a7590d00b959
2020-12-01T16:47:35.546444-0700 mgr.aladdin.liknom [INF] Upgrade: Checking mgr daemons...
2020-12-01T16:47:35.547185-0700 mgr.aladdin.liknom [INF] Upgrade: Need to upgrade myself (mgr.aladdin.liknom)
2020-12-01T16:47:37.506337-0700 mgr.aladdin.liknom [INF] Upgrade: Pulling docker.io/ceph/ceph:v15.2.7 on ether
2020-12-01T16:47:40.770290-0700 mgr.aladdin.liknom [INF] Upgrade: image docker.io/ceph/ceph:v15.2.7 pull on ether got new image 2bc420ddb175bd1cf9031387948a8812d1bda9ef1180e429b4704e3c06bb943e (not 9a0677fecc08d155a8e643b37c6e97d45c04747d9cb9455cafe0a7590d00b959), restarting
2020-12-01T16:47:41.172402-0700 mgr.aladdin.liknom [INF] Upgrade: Target is docker.io/ceph/ceph:v15.2.7 with id 2bc420ddb175bd1cf9031387948a8812d1bda9ef1180e429b4704e3c06bb943e
2020-12-01T16:47:41.226550-0700 mgr.aladdin.liknom [INF] Upgrade: Checking mgr daemons...
2020-12-01T16:47:41.230932-0700 mgr.aladdin.liknom [INF] Upgrade: All mgr daemons are up to date.
2020-12-01T16:47:41.231887-0700 mgr.aladdin.liknom [INF] Upgrade: Checking mon daemons...
2020-12-01T16:47:43.179844-0700 mgr.aladdin.liknom [INF] Upgrade: All mon daemons are up to date.
2020-12-01T16:47:43.180305-0700 mgr.aladdin.liknom [INF] Upgrade: Checking crash daemons...
2020-12-01T16:47:43.187481-0700 mgr.aladdin.liknom [INF] Upgrade: Setting container_image for all crash...
2020-12-01T16:47:43.191821-0700 mgr.aladdin.liknom [INF] Upgrade: All crash daemons are up to date.
2020-12-01T16:47:43.192290-0700 mgr.aladdin.liknom [INF] Upgrade: Checking osd daemons...
2020-12-01T16:47:45.692126-0700 mgr.aladdin.liknom [INF] Upgrade: Pulling docker.io/ceph/ceph:v15.2.7 on mandalaybay
2020-12-01T16:47:50.679789-0700 mgr.aladdin.liknom [INF] Upgrade: image docker.io/ceph/ceph:v15.2.7 pull on mandalaybay got new image 9a0677fecc08d155a8e643b37c6e97d45c04747d9cb9455cafe0a7590d00b959 (not 2bc420ddb175bd1cf9031387948a8812d1bda9ef1180e429b4704e3c06bb943e), restarting
The machines 'ether' and 'aladdin' are x86_64 machines, but
'mandalaybay' is a Raspberry Pi 4 (arm64).
Is there a way to bypass this check to allow me to finish upgrading the cluster?
Thanks,
Bryan
I think you should open an issue on the Ceph tracker, as it seems the cephadm upgrade
workflow doesn't support multi-arch container images.
docker.io/ceph/ceph:v15.2.7 is a manifest list [1], which, depending on the host
architecture (x86_64 or ARMv8), will provide you with the right container image.
The docker.io/ceph/ceph manifest list references the docker.io/ceph/ceph-amd64 and
docker.io/ceph/ceph-arm64 container images.
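As a rough illustration of how a manifest list maps a host architecture to an image, here is a minimal Python sketch. The manifest structure is trimmed down, the digests are placeholders rather than the real values for v15.2.7, and resolve_digest and MACHINE_TO_ARCH are hypothetical names, not part of any container runtime or of cephadm.

```python
# Hypothetical, trimmed-down manifest list; the digests are placeholders.
MANIFEST_LIST = {
    "manifests": [
        {
            "digest": "sha256:<amd64-digest>",
            "platform": {"architecture": "amd64", "os": "linux"},
        },
        {
            "digest": "sha256:<arm64-digest>",
            "platform": {"architecture": "arm64", "os": "linux"},
        },
    ]
}

# Map `uname -m` machine names to the architecture names used in manifests.
MACHINE_TO_ARCH = {"x86_64": "amd64", "aarch64": "arm64"}


def resolve_digest(manifest_list, machine, os_name="linux"):
    """Return the digest of the entry matching the host's platform."""
    arch = MACHINE_TO_ARCH.get(machine, machine)
    for entry in manifest_list["manifests"]:
        platform = entry["platform"]
        if platform["architecture"] == arch and platform["os"] == os_name:
            return entry["digest"]
    raise LookupError(f"no image for {os_name}/{arch}")


if __name__ == "__main__":
    for machine in ("x86_64", "aarch64"):
        print(machine, "->", resolve_digest(MANIFEST_LIST, machine))
```

The same tag therefore resolves to a different image, and a different image ID, on each architecture.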
So it's expected to get container image ID 2bc420ddb175 on your x86_64 hosts and
9a0677fecc08 on the ARMv8 host, but cephadm doesn't handle this configuration: it compares
the container image ID across hosts with different architectures [2].
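To make the loop concrete, here is a small Python simulation. It is not the cephadm code: the function names are hypothetical, and the per-architecture variant is only one possible way such a check could be made arch-aware. The host names, architectures, and shortened image IDs are taken from the log above.

```python
# Hosts and their architectures, as described in the original post.
HOST_ARCH = {"aladdin": "x86_64", "ether": "x86_64", "mandalaybay": "arm64"}

# Shortened image IDs that each architecture gets for the same tag.
IMAGE_ID_BY_ARCH = {"x86_64": "2bc420ddb175", "arm64": "9a0677fecc08"}


def count_restarts_single_target(pull_sequence, initial_target):
    """Simulate one cluster-wide target ID: any mismatch resets it and restarts."""
    target = initial_target
    restarts = 0
    for host in pull_sequence:
        image_id = IMAGE_ID_BY_ARCH[HOST_ARCH[host]]
        if image_id != target:
            target = image_id
            restarts += 1
    return restarts


def count_restarts_per_arch(pull_sequence):
    """Hypothetical alternative: track one target ID per architecture."""
    targets = {}
    restarts = 0
    for host in pull_sequence:
        arch = HOST_ARCH[host]
        image_id = IMAGE_ID_BY_ARCH[arch]
        if arch not in targets:
            targets[arch] = image_id  # first pull on this arch sets its target
        elif targets[arch] != image_id:
            targets[arch] = image_id
            restarts += 1
    return restarts


if __name__ == "__main__":
    # The log alternates between pulling on mandalaybay and on ether.
    pulls = ["mandalaybay", "ether"] * 3
    print(count_restarts_single_target(pulls, "2bc420ddb175"))
    print(count_restarts_per_arch(pulls))
```

With a single shared target, every pull on a host of the other architecture looks like a new image and triggers a restart, which matches the alternating "got new image ... restarting" lines in the log.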
[1] https://hub.docker.com/r/ceph/ceph/tags?page=1&ordering=last_updated&am…
[2] https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/upgrade.py#…