Hi all,
We have a CephFS with its data_pool in erasure coding (3+2) and 1024 PGs
(Nautilus 14.2.8).
One of the PGs is partially destroyed (we lost 3 OSDs, thus 3 shards); it
has 143 unfound objects and is stuck in state
"active+recovery_unfound+undersized+degraded+remapped".
We have therefore lost some data (we are using cephfs-data-scan pg_files... to
identify the files that have data on the bad PG).
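For reference, we are invoking it roughly like this (the directory and PG id
below are placeholders for our real ones):
cephfs-data-scan pg_files /path/in/cephfs 2.1f > files_on_bad_pg.txt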
We then created a new filesystem (this time with the data pool in replica 3)
and we are copying all the data from the broken FS to the new one.
But we need to remove files from the broken FS after copying, to free space
(because there will not be enough space on the cluster otherwise). To avoid
problems with strays, we removed the snapshots on the broken FS before
deleting files.
The problem is that the MDS managing the broken FS is now "Behind on
trimming (123036/128) max_segments: 128, num_segments: 123036"
and has "1 slow metadata IOs are blocked > 30 secs, oldest blocked for
83645 secs".
The slow IO corresponds to osd.27, which is acting_primary for the broken
PG, and the broken PG has a long "snap_trimq":
"[1e0c~1,1e0e~1,1e12~1,1e16~1,1e18~1,1e1a~1,........" and
"snap_trimq_len": 460.
It therefore seems that CephFS is not able to trim the ops corresponding to the
deletion of objects and snaps that have data on the broken PG, probably
because the PG is not healthy.
Is there a way to tell Ceph/CephFS to flush or forget about (only) the lost
objects on the broken PG, and to get this PG healthy enough to perform
trimming?
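For context, the only mechanism we are aware of is marking the unfound objects
lost, along these lines (the PG id is a placeholder, and we are unsure whether
"delete" or "revert" is appropriate, or whether this is safe at all on an EC pool):
ceph pg 2.1f mark_unfound_lost delete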
thanks for your help
F.
hi there,
trying to get my head around rocksdb spillovers and how to deal with
them … in particular, i have one osd which does not have any pools
associated (as per ceph pg ls-by-osd $osd ), yet it does show up in ceph
health detail as:
osd.$osd spilled over 2.9 MiB metadata from 'db' device (49 MiB
used of 37 GiB) to slow device
compaction doesn't help (see the exact command below). i am well aware of
https://tracker.ceph.com/issues/38745 , yet find it really
counter-intuitive that an empty osd with a more-or-less optimally sized db
volume can't fit its rocksdb on that volume.
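for completeness, the compaction i tried was roughly this (with the osd stopped):
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-$osd compact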
is there any way to repair this, apart from re-creating the osd? fwiw,
dumping the database with
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-$osd dump >
bluestore_kv.dump
yields a file of less than 100 MB in size.
and, while we're at it, a few more related questions:
- am i right to assume that the leveldb and rocksdb arguments to
ceph-kvstore-tool are only relevant for osds with a filestore backend?
- does ceph-kvstore-tool bluestore-kv … also deal with the rocksdb items of
osds with a bluestore backend?
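to make the second question concrete: i am asking whether something like
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-$osd list
(run with the osd stopped) operates on the very same rocksdb that the bluestore
osd itself uses.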
thank you very much & with kind regards,
thoralf.
I am new to Ceph so I hope this is not a question of me not reading the documentation well enough.
I have set up a small cluster to learn on, with three physical hosts, each with two NICs.
The cluster is up and running, but I have not figured out how to tie the OSDs to my second interface for a separate cluster network; as it is now, all communication goes through the public network.
Is it possible to define the cluster network with cephadm in some way?
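For example, is setting the option centrally (the subnet below is just an example) the intended way with cephadm, or is there a bootstrap/spec option I am missing?
ceph config set global cluster_network 192.168.10.0/24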
/Jimmy
Hi, trying to migrate a second Ceph cluster to cephadm. All the hosts migrated successfully from "legacy" except one of the OSD hosts (cephadm kept duplicating OSD ids, e.g. two "osd.5"; still not sure why). To make things easier, we re-provisioned the node (reinstalled from netinstall, applied the same SaltStack traits as the other nodes, wiped the disks) and tried to use cephadm to set up the OSDs.
So, orch correctly starts the provisioning process (a docker container running ceph-volume is created), but the provisioning never completes (docker exec):
# ps axu
root 1 0.1 0.2 99272 22488 ? Ss 15:26 0:01 /usr/libexec/platform-python -s /usr/sbin/ceph-volume lvm batch --no-auto /dev/sdb /dev/sdc --dmcrypt --yes --no-systemd
root 807 0.9 0.5 154560 44120 ? S<L 15:26 0:06 /usr/sbin/cryptsetup --key-file - --allow-discards luksOpen /dev/ceph-851cae40-3270-45ea-b788-be6e05465e92/osd-data-e3157b54-f6b9-4ec9-ab12-e289f52c00a4 Afr6Ct-ok4h-pBEy-GfFF-xxYl-EKwi-cHhjZc
# cat /var/log/ceph/ceph-volume.log
Running command: /usr/sbin/cryptsetup --batch-mode --key-file - luksFormat /dev/ceph-851cae40-3270-45ea-b788-be6e05465e92/osd-data-e3157b54-f6b9-4ec9-ab12-e289f52c00a4
Running command: /usr/sbin/cryptsetup --key-file - --allow-discards luksOpen /dev/ceph-851cae40-3270-45ea-b788-be6e05465e92/osd-data-e3157b54-f6b9-4ec9-ab12-e289f52c00a4 Afr6Ct-ok4h-pBEy-GfFF-xxYl-EKwi-cHhjZc
# docker ps
2956dec0450d ceph/ceph:v15 "/usr/sbin/ceph-volu…" 14 minutes ago Up 14 minutes condescending_nightingale
# cat osd_spec_default.yaml
service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: '*'
data_devices:
  all: true
encrypted: true
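(If it matters, the spec above was applied with something like "ceph orch apply osd -i osd_spec_default.yaml".)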
It looks like cephadm hangs on luksOpen.
Is this expected (encryption is said to be supported, although there is essentially no documentation for it)?
This is again about our bad cluster with too many objects, where the HDD
OSDs have a DB device that is (much) too small (e.g. 20 GB, i.e. 3 GB
usable). Now several OSDs do not come up any more.
Typical error message:
/build/ceph-14.2.8/src/os/bluestore/BlueFS.cc: 2261: FAILED
ceph_assert(h->file->fnode.ino != 1)
I also just tried to add a few GB to the DB device (lvextend, then
ceph-bluestore-tool bluefs-bdev-expand), but this crashes as well, with the
same message.
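Roughly what I ran (the LV name is a placeholder for our real one):
lvextend -L +10G /dev/vg-db/db-$OSD
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-$OSD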
Options that helped us before (thanks Wido :-) do not help here, e.g.
CEPH_ARGS="--bluestore-rocksdb-options compaction_readahead_size=0"
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-$OSD compact
Any ideas that I could try to save these OSDs?
Cheers
Harry
Hello,
I have a live Ceph cluster, and I need to modify its bucket hierarchy. I am currently using the default CRUSH rule (i.e. keep each replica on a different host). I need to add a "chassis" level and keep replicas separated at the chassis level.
From what I read in the documentation, I would have to edit the CRUSH map manually; however, this sounds rather scary on a live cluster.
Are there any “best known methods” to achieve that goal without messing things up?
In my current scenario I have one host per chassis, and I am planning on later adding nodes where there would be more than one host per chassis. It looks like, "in theory", there wouldn't be a need for any data movement after the CRUSH map changes. Will reality match theory? Anything else I need to watch out for?
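For example, would something along these lines be the sane way to do it (bucket, host and pool names below are placeholders), rather than decompiling and editing the map by hand?
ceph osd crush add-bucket chassis1 chassis
ceph osd crush move chassis1 root=default
ceph osd crush move host1 chassis=chassis1
ceph osd crush rule create-replicated replicated_chassis default chassis
ceph osd pool set <pool> crush_rule replicated_chassis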
Thank you!
George