I have done it.
I am not sure whether I missed something, but I upgraded a test cluster from
CentOS 7.7.1908 + Ceph 14.2.8 to Debian 10.3 + Ceph 15.2.0.
Preparations:
- 6 nodes with OS CentOS 7.7.1908, Ceph 14.2.8:
- cephtest01, cephtest02, cephtest03: mon+mgr+mds+rgw
- cephtest04, cephtest05, cephtest06: OSD disks (for test purposes, only one disk per node)
- default pools for RGW
- a replicated pool for metadata and an erasure-coded pool for data, for CephFS
0) Set flag noout
# ceph osd set noout
1) I made a backup of the admin keyring and config from /etc/ceph/ (they must be
identical on all nodes)
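A minimal sketch of that backup (the archive path under /root/ is my choice, not part of the original procedure):

```shell
# Archive the cluster config and admin keyring before wiping the node.
# /root/ceph-etc-backup.tar.gz is a hypothetical destination; copy it off the node.
tar czf /root/ceph-etc-backup.tar.gz -C / etc/ceph
```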
2) I stopped all services on cephtest03 and noted their names:
ceph-mon@cephtest03
ceph-mds@cephtest03
ceph-mgr@cephtest03
ceph-radosgw@rgw.cephtest03
3) Then made a full backup of /var/lib/ceph/
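For the full backup, something like this works once the services are stopped (the archive name is again my choice):

```shell
# Full backup of the Ceph state directory (mon store, daemon keyrings, etc.).
# Extract later as root with "tar xpf" so ownership and modes are preserved.
tar czf /root/ceph-var-backup.tar.gz -C / var/lib/ceph
```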
4) Installed a new OS (Debian 10) with the same hostname and IP address
5) Installed the new Ceph 15.2.0 packages
6) I copied admin keyring and config back to /etc/ceph/
7) Returned everything to /var/lib/ceph/
8) The monitor data seems to need the right ownership, so I ran
# chown -R ceph:ceph /var/lib/ceph/mon
9) Then I turned on the same services that were before in 2)
# systemctl enable ceph-mon@cephtest03
# systemctl enable ceph-mds@cephtest03
# systemctl enable ceph-mgr@cephtest03
# systemctl enable ceph-radosgw@rgw.cephtest03
10) Started only the monitor
# systemctl start ceph-mon@cephtest03
11) Waited until a quorum was formed
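To watch for quorum I would use the standard mon status commands, e.g.:

```shell
# Wait for the restarted monitor to rejoin quorum before starting the other daemons.
# "ceph mon stat" prints the current quorum members.
ceph mon stat
# or the full picture:
ceph -s
```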
12) Started other services
# systemctl start ceph-mds@cephtest03
# systemctl start ceph-mgr@cephtest03
# systemctl start ceph-radosgw@rgw.cephtest03
13) Repeated steps 2)-12) on cephtest02 and cephtest01
14) Now the next part: the OSDs. I started with cephtest04
15) It seems I only need to note the service names, such as
ceph-volume@lvm-{osd_id}-{lv_name}
ceph-osd@{osd_id}
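One way to collect those unit names before reinstalling (both commands are standard systemd / ceph-volume):

```shell
# List the per-OSD systemd units that must be re-enabled after the reinstall.
systemctl list-units --all 'ceph-osd@*' 'ceph-volume@*'
# ceph-volume can also map OSD ids to their logical volumes:
ceph-volume lvm list
```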
16) I stopped all services and made sure that all PGs were in the active+clean state
(and some of them in active+degraded)
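The PG states can be checked with, for example:

```shell
# Confirm PG health before taking this node's OSDs down:
# noout (set in step 0) keeps the down OSDs in the map meanwhile.
ceph pg stat
```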
17) I saved only the keyring from /var/lib/ceph/bootstrap-osd/
18) Repeated steps 4)-6)
19) Returned the bootstrap-osd keyring from 17) back to /var/lib/ceph/bootstrap-osd/
20) Then I enabled the same services as noted in 15) and started ceph-volume
# systemctl enable ceph-volume@lvm-{osd_id}-{lv_name}
# systemctl enable --runtime ceph-osd@{osd_id}
# systemctl start ceph-volume@lvm-{osd_id}-{lv_name}
21) Repeated steps 15)-20) on cephtest05 and cephtest06
22) Unset the noout flag
# ceph osd unset noout
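As a final check, something like the following confirms that every daemon is actually running 15.2.0 and the cluster is healthy:

```shell
# All daemons should now report octopus (15.2.0).
ceph versions
ceph health detail
```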
I got an upgraded cluster with a working dashboard and HEALTH_OK :)
And I want to ask only one question: am I missing something, or is it really this
easy to switch from one host OS to another?