October 2020 - ceph-users

by Ml Ml

Hello, i played around with some log level i can´t remember and my logs are now getting bigger than my DVD-Movie collection. E.g.: journalctl -b -u ceph-5436dd5d-83d4-4dc8-a93b-60ab5db145df(a)mon.ceph03.service > out.file is 1,1GB big. I did already try: ceph tell mon.ceph03 config set debug_mon 0/10 ceph tell mon.ceph03 config set debug_osd 0/10 ceph tell mon.ceph03 config set debug_mgr 0/10 ceph tell mon.ceph03 config set "mon_health_to_clog" false ceph tell mon.ceph03 config set "mon_health_log_update_period" 30 ceph tell mon.ceph03 config set "debug_mgr" "0/0" which made it better, but i really cant remember it all and would like to have the default values. Is there a way to reset those Log Values? Cheers, Michael

3 years, 5 months

2
2
0 0

Monitor persistently out-of-quorum

by Ki Wong

Hello, I am at my wit's end. So I made a mistake in the configuration of my router and one of the monitors (out of 3) dropped out of the quorum and nothing I’ve done allow it to rejoin. That includes reinstalling the monitor with ceph-ansible. The connectivity issue is fixed. I’ve tested it using “nc” and the host can connect to both port 3300 and 6789 of the other monitors. But the wayward monitor continue to stay out of quorum. What is wrong? I see a bunch of “EBUSY” errors in the log, with the message: e1 handle_auth_request haven't formed initial quorum, EBUSY How do I fix this? Any help would be greatly appreciated. Many thanks, -kc With debug_mon at 1/10, I got these log snippets: 2020-10-28 15:40:05.961 7fb79253a700 4 mon.mgmt03@2(probing) e1 probe_timeout 0x564050353ec0 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 bootstrap 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 sync_reset_requester 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 unregister_cluster_logger - not registered 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 cancel_probe_timeout (none scheduled) 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 monmap e1: 3 mons at {mgmt01=[v2:10.0.1.1:3300/0,v1:10.0.1.1:6789/0],mgmt02=[v2:10.1.1.1:3300/0,v1:10.1.1.1:6789/0],mgmt03=[v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0]} 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 _reset 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03(a)2(probing).auth v0 _set_mon_num_rank num 0 rank 0 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 cancel_probe_timeout (none scheduled) 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 timecheck_finish 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 scrub_event_cancel 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 scrub_reset 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 cancel_probe_timeout (none scheduled) 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 reset_probe_timeout 0x564050347ce0 after 2 seconds 2020-10-28 15:40:05.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 probing other monitors 2020-10-28 15:40:07.961 7fb79253a700 4 mon.mgmt03@2(probing) e1 probe_timeout 0x564050347ce0 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 bootstrap 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 sync_reset_requester 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 unregister_cluster_logger - not registered 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 cancel_probe_timeout (none scheduled) 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 monmap e1: 3 mons at {mgmt01=[v2:10.0.1.1:3300/0,v1:10.0.1.1:6789/0],mgmt02=[v2:10.1.1.1:3300/0,v1:10.1.1.1:6789/0],mgmt03=[v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0]} 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 _reset 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03(a)2(probing).auth v0 _set_mon_num_rank num 0 rank 0 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 cancel_probe_timeout (none scheduled) 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 timecheck_finish 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 scrub_event_cancel 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 scrub_reset 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 cancel_probe_timeout (none scheduled) 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 reset_probe_timeout 0x564050360660 after 2 seconds 2020-10-28 15:40:07.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 probing other monitors 2020-10-28 15:40:09.107 7fb79253a700 -1 mon.mgmt03@2(probing) e1 get_health_metrics reporting 7 slow ops, oldest is log(1 entries from seq 1 at 2020-10-27 23:03:41.586915) 2020-10-28 15:40:09.961 7fb79253a700 4 mon.mgmt03@2(probing) e1 probe_timeout 0x564050360660 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 bootstrap 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 sync_reset_requester 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 unregister_cluster_logger - not registered 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 cancel_probe_timeout (none scheduled) 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 monmap e1: 3 mons at {mgmt01=[v2:10.0.1.1:3300/0,v1:10.0.1.1:6789/0],mgmt02=[v2:10.1.1.1:3300/0,v1:10.1.1.1:6789/0],mgmt03=[v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0]} 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 _reset 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03(a)2(probing).auth v0 _set_mon_num_rank num 0 rank 0 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 cancel_probe_timeout (none scheduled) 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 timecheck_finish 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 scrub_event_cancel 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 scrub_reset 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 cancel_probe_timeout (none scheduled) 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 reset_probe_timeout 0x5640503606c0 after 2 seconds 2020-10-28 15:40:09.961 7fb79253a700 10 mon.mgmt03@2(probing) e1 probing other monitors 2020-10-28 15:40:11.962 7fb79253a700 4 mon.mgmt03@2(probing) e1 probe_timeout 0x5640503606c0 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 bootstrap 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 sync_reset_requester 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 unregister_cluster_logger - not registered 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 cancel_probe_timeout (none scheduled) 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 monmap e1: 3 mons at {mgmt01=[v2:10.0.1.1:3300/0,v1:10.0.1.1:6789/0],mgmt02=[v2:10.1.1.1:3300/0,v1:10.1.1.1:6789/0],mgmt03=[v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0]} 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 _reset 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03(a)2(probing).auth v0 _set_mon_num_rank num 0 rank 0 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 cancel_probe_timeout (none scheduled) 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 timecheck_finish 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 scrub_event_cancel 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 scrub_reset 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 cancel_probe_timeout (none scheduled) 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 reset_probe_timeout 0x564050360900 after 2 seconds 2020-10-28 15:40:11.962 7fb79253a700 10 mon.mgmt03@2(probing) e1 probing other monitors 2020-10-28 15:40:12.354 7fb79453e700 10 mon.mgmt03@2(probing) e1 handle_auth_request con 0x56404cf25400 (start) method 2 payload 32 2020-10-28 15:40:12.354 7fb79453e700 10 mon.mgmt03@2(probing) e1 handle_auth_request haven't formed initial quorum, EBUSY 2020-10-28 15:40:12.354 7fb78fd35700 10 mon.mgmt03@2(probing) e1 ms_handle_reset 0x56404cf25400 - … 2020-10-28 15:40:59.968 7fb79253a700 10 mon.mgmt03@2(probing) e1 probing other monitors 2020-10-28 15:41:00.110 7fb79453e700 10 mon.mgmt03@2(probing) e1 handle_auth_request con 0x56404cae1000 (start) method 2 payload 22 2020-10-28 15:41:00.110 7fb79453e700 10 mon.mgmt03@2(probing) e1 handle_auth_request haven't formed initial quorum, EBUSY 2020-10-28 15:41:00.110 7fb78fd35700 10 mon.mgmt03@2(probing) e1 ms_handle_reset 0x56404cae1000 - 2020-10-28 15:41:00.110 7fb793d3d700 10 mon.mgmt03@2(probing) e1 handle_auth_request con 0x56404c8d4c00 (start) method 2 payload 22 2020-10-28 15:41:00.110 7fb793d3d700 10 mon.mgmt03@2(probing) e1 handle_auth_request haven't formed initial quorum, EBUSY 2020-10-28 15:41:00.110 7fb78fd35700 10 mon.mgmt03@2(probing) e1 ms_handle_reset 0x56404c8d4c00 - 2020-10-28 15:41:00.117 7fb79453e700 10 mon.mgmt03@2(probing) e1 handle_auth_request con 0x56404c630800 (start) method 2 payload 22 2020-10-28 15:41:00.117 7fb79453e700 10 mon.mgmt03@2(probing) e1 handle_auth_request haven't formed initial quorum, EBUSY 2020-10-28 15:41:00.117 7fb78fd35700 10 mon.mgmt03@2(probing) e1 ms_handle_reset 0x56404c630800 - … 2020-10-28 15:42:42.379 7fb78d530700 4 rocksdb: [db/db_impl.cc:777] ------- DUMPING STATS ------- 2020-10-28 15:42:42.379 7fb78d530700 4 rocksdb: [db/db_impl.cc:778] ** DB Stats ** Uptime(secs): 60000.0 total, 600.0 interval Cumulative writes: 6 writes, 7 keys, 6 commit groups, 0.9 writes per commit group, ingest: 0.00 GB, 0.00 MB/s Cumulative WAL: 6 writes, 6 syncs, 0.86 writes per sync, written: 0.00 GB, 0.00 MB/s Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent Interval writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 MB, 0.00 MB/s Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 MB, 0.00 MB/s Interval stall: 00:00:0.000 H:M:S, 0.0 percent ** Compaction Stats [default] ** Level Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- L0 2/0 3.02 KB 0.5 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.7 0.00 0.00 1 0.001 0 0 Sum 2/0 3.02 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.7 0.00 0.00 1 0.001 0 0 Int 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0 0.000 0 0 ** Compaction Stats [default] ** Priority Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- User 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.7 0.00 0.00 1 0.001 0 0 Uptime(secs): 60000.0 total, 600.0 interval Flush(GB): cumulative 0.000, interval 0.000 AddFile(GB): cumulative 0.000, interval 0.000 AddFile(Total Files): cumulative 0, interval 0 AddFile(L0 Files): cumulative 0, interval 0 AddFile(Keys): cumulative 0, interval 0 Cumulative compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count … 2020-10-28 17:17:11.781 7f5f694821c0 0 using public_addr v2:10.2.1.1:0/0 -> [v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0] 2020-10-28 17:17:11.781 7f5f694821c0 0 starting mon.mgmt03 rank -1 at public addrs [v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0] at bind addrs [v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0] mon_data /var/lib/ceph/mon/ceph-mgmt03 fsid 374aed9e-5fc1-47e1-8d29-4416f7425e76 2020-10-28 17:17:11.783 7f5f694821c0 1 mon.mgmt03@-1(???) e2 preinit fsid 374aed9e-5fc1-47e1-8d29-4416f7425e76 2020-10-28 17:17:11.783 7f5f694821c0 1 mon.mgmt03@-1(???) e2 initial_members mgmt01,mgmt02,mgmt03, filtering seed monmap 2020-10-28 17:17:11.783 7f5f694821c0 1 mon.mgmt03@-1(???) e2 preinit clean up potentially inconsistent store state 2020-10-28 17:17:11.785 7f5f694821c0 0 -- [v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0] send_to message mon_probe(probe 374aed9e-5fc1-47e1-8d29-4416f7425e76 name mgmt03 new mon_release 14) v7 with empty dest 2020-10-28 17:17:13.191 7f5f5170d700 0 mon.mgmt03@-1(probing) e3 monmap addrs for rank 2 changed, i am [v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0], monmap is v2:10.2.1.1:3300/0, respawning 2020-10-28 17:17:13.191 7f5f5170d700 -1 mon.mgmt03@-1(probing) e3 stashing newest monmap 3 for next startup 2020-10-28 17:17:13.191 7f5f5170d700 0 mon.mgmt03@-1(probing) e3 respawn 2020-10-28 17:17:13.191 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 e: '/usr/bin/ceph-mon' 2020-10-28 17:17:13.191 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 0: '/usr/bin/ceph-mon' 2020-10-28 17:17:13.191 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 1: '-f' 2020-10-28 17:17:13.191 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 2: '--cluster' 2020-10-28 17:17:13.191 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 3: 'ceph' 2020-10-28 17:17:13.191 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 4: '--id' 2020-10-28 17:17:13.191 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 5: 'mgmt03' 2020-10-28 17:17:13.191 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 6: '--setuser' 2020-10-28 17:17:13.191 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 7: 'ceph' 2020-10-28 17:17:13.192 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 8: '--setgroup' 2020-10-28 17:17:13.192 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 9: 'ceph' 2020-10-28 17:17:13.192 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 respawning with exe /usr/bin/ceph-mon 2020-10-28 17:17:13.192 7f5f5170d700 1 mon.mgmt03@-1(probing) e3 exe_path /proc/self/exe 2020-10-28 17:17:13.217 7eff1f7cd1c0 0 ceph version 14.2.11 (f7fdb2f52131f54b891a2ec99d8205561242cdaf) nautilus (stable), process ceph-mon, pid 24265 2020-10-28 17:17:13.217 7eff1f7cd1c0 0 pidfile_write: ignore empty --pid-file 2020-10-28 17:17:13.246 7eff1f7cd1c0 0 load: jerasure load: lrc load: isa 2020-10-28 17:17:13.246 7eff1f7cd1c0 0 set rocksdb option compression = kNoCompression 2020-10-28 17:17:13.246 7eff1f7cd1c0 0 set rocksdb option level_compaction_dynamic_level_bytes = true 2020-10-28 17:17:13.246 7eff1f7cd1c0 0 set rocksdb option write_buffer_size = 33554432 2020-10-28 17:17:13.246 7eff1f7cd1c0 0 set rocksdb option compression = kNoCompression 2020-10-28 17:17:13.246 7eff1f7cd1c0 0 set rocksdb option level_compaction_dynamic_level_bytes = true 2020-10-28 17:17:13.246 7eff1f7cd1c0 0 set rocksdb option write_buffer_size = 33554432 2020-10-28 17:17:13.246 7eff1f7cd1c0 1 rocksdb: do_open column families: [default] 2020-10-28 17:17:13.246 7eff1f7cd1c0 4 rocksdb: RocksDB version: 6.1.2 2020-10-28 17:17:13.246 7eff1f7cd1c0 4 rocksdb: Git sha rocksdb_build_git_sha:@0@ 2020-10-28 17:17:13.246 7eff1f7cd1c0 4 rocksdb: Compile date Aug 13 2020 2020-10-28 17:17:13.246 7eff1f7cd1c0 4 rocksdb: DB SUMMARY 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: CURRENT file: CURRENT 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: IDENTITY file: IDENTITY 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: MANIFEST file: MANIFEST-009877 size: 194 Bytes 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: SST files in /var/lib/ceph/mon/ceph-mgmt03/store.db dir, Total Num: 2, files: 009874.sst 009876.sst 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Write Ahead Log file in /var/lib/ceph/mon/ceph-mgmt03/store.db: 009878.log size: 451 ; 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.error_if_exists: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.create_if_missing: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.paranoid_checks: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.env: 0x55beb27a5780 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.info_log: 0x55beb3d8e300 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_file_opening_threads: 16 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.statistics: (nil) 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.use_fsync: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_log_file_size: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_manifest_file_size: 1073741824 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.log_file_time_to_roll: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.keep_log_file_num: 1000 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.recycle_log_file_num: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.allow_fallocate: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.allow_mmap_reads: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.allow_mmap_writes: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.use_direct_reads: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.use_direct_io_for_flush_and_compaction: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.create_missing_column_families: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.db_log_dir: 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.wal_dir: /var/lib/ceph/mon/ceph-mgmt03/store.db 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.table_cache_numshardbits: 6 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_subcompactions: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_background_flushes: -1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.WAL_ttl_seconds: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.WAL_size_limit_MB: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.manifest_preallocation_size: 4194304 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.is_fd_close_on_exec: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.advise_random_on_open: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.db_write_buffer_size: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.write_buffer_manager: 0x55beb3d98b40 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.access_hint_on_compaction_start: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.new_table_reader_for_compaction_inputs: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.random_access_max_buffer_size: 1048576 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.use_adaptive_mutex: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.rate_limiter: (nil) 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.sst_file_manager.rate_bytes_per_sec: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.wal_recovery_mode: 2 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.enable_thread_tracking: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.enable_pipelined_write: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.allow_concurrent_memtable_write: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.enable_write_thread_adaptive_yield: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.write_thread_max_yield_usec: 100 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.write_thread_slow_yield_usec: 3 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.row_cache: None 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.wal_filter: None 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.avoid_flush_during_recovery: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.allow_ingest_behind: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.preserve_deletes: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.two_write_queues: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.manual_wal_flush: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.atomic_flush: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.avoid_unnecessary_blocking_io: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_background_jobs: 2 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_background_compactions: -1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.avoid_flush_during_shutdown: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.writable_file_max_buffer_size: 1048576 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.delayed_write_rate : 16777216 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_total_wal_size: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.delete_obsolete_files_period_micros: 21600000000 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.stats_dump_period_sec: 600 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.stats_persist_period_sec: 600 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.stats_history_buffer_size: 1048576 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_open_files: -1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.bytes_per_sync: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.wal_bytes_per_sync: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_readahead_size: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Compression algorithms supported: 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: kZSTDNotFinalCompression supported: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: kZSTD supported: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: kXpressCompression supported: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: kLZ4HCCompression supported: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: kLZ4Compression supported: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: kBZip2Compression supported: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: kZlibCompression supported: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: kSnappyCompression supported: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Fast CRC32 supported: Supported on x86 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: [db/version_set.cc:3543] Recovering from manifest file: /var/lib/ceph/mon/ceph-mgmt03/store.db/MANIFEST-009877 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: [db/column_family.cc:477] --------------- Options for column family [default]: 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.comparator: leveldb.BytewiseComparator 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.merge_operator: 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_filter: None 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_filter_factory: None 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.memtable_factory: SkipListFactory 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.table_factory: BlockBasedTable 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: table_factory options: flush_block_policy_factory: FlushBlockBySizePolicyFactory (0x55beb30daac0) cache_index_and_filter_blocks: 1 cache_index_and_filter_blocks_with_high_priority: 1 pin_l0_filter_and_index_blocks_in_cache: 0 pin_top_level_index_and_filter: 1 index_type: 0 data_block_index_type: 0 data_block_hash_table_util_ratio: 0.750000 hash_index_allow_collision: 1 checksum: 1 no_block_cache: 0 block_cache: 0x55beb30f7010 block_cache_name: BinnedLRUCache block_cache_options: capacity : 536870912 num_shard_bits : 4 strict_capacity_limit : 0 high_pri_pool_ratio: 0.000 block_cache_compressed: (nil) persistent_cache: (nil) block_size: 4096 block_size_deviation: 10 block_restart_interval: 16 index_block_restart_interval: 1 metadata_block_size: 4096 partition_filters: 0 use_delta_encoding: 1 filter_policy: rocksdb.BuiltinBloomFilter whole_key_filtering: 1 verify_compression: 0 read_amp_bytes_per_bit: 0 format_version: 2 enable_index_compression: 1 block_align: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.write_buffer_size: 33554432 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_write_buffer_number: 2 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compression: NoCompression 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.bottommost_compression: Disabled 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.prefix_extractor: nullptr 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.memtable_insert_with_hint_prefix_extractor: nullptr 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.num_levels: 7 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.min_write_buffer_number_to_merge: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_write_buffer_number_to_maintain: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.bottommost_compression_opts.window_bits: -14 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.bottommost_compression_opts.level: 32767 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.bottommost_compression_opts.strategy: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.bottommost_compression_opts.max_dict_bytes: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.bottommost_compression_opts.zstd_max_train_bytes: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.bottommost_compression_opts.enabled: false 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compression_opts.window_bits: -14 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compression_opts.level: 32767 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compression_opts.strategy: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compression_opts.max_dict_bytes: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compression_opts.zstd_max_train_bytes: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compression_opts.enabled: false 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.level0_file_num_compaction_trigger: 4 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.level0_slowdown_writes_trigger: 20 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.level0_stop_writes_trigger: 36 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.target_file_size_base: 67108864 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.target_file_size_multiplier: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_bytes_for_level_base: 268435456 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.level_compaction_dynamic_level_bytes: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_bytes_for_level_multiplier: 10.000000 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[0]: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[1]: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[2]: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[3]: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[4]: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[5]: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[6]: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_sequential_skip_in_iterations: 8 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_compaction_bytes: 1677721600 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.arena_block_size: 4194304 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.soft_pending_compaction_bytes_limit: 68719476736 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.hard_pending_compaction_bytes_limit: 274877906944 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.rate_limit_delay_max_milliseconds: 100 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.disable_auto_compactions: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_style: kCompactionStyleLevel 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_pri: kMinOverlappingRatio 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_options_universal.size_ratio: 1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_options_universal.min_merge_width: 2 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_options_universal.max_merge_width: 4294967295 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_options_universal.max_size_amplification_percent: 200 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_options_universal.compression_size_percent: -1 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_options_universal.stop_style: kCompactionStopStyleTotalSize 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_options_fifo.max_table_files_size: 1073741824 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.compaction_options_fifo.allow_compaction: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.table_properties_collectors: 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.inplace_update_support: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.inplace_update_num_locks: 10000 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.memtable_prefix_bloom_size_ratio: 0.000000 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.memtable_whole_key_filtering: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.memtable_huge_page_size: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.bloom_locality: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.max_successive_merges: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.optimize_filters_for_hits: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.paranoid_file_checks: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.force_consistency_checks: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.report_bg_io_stats: 0 2020-10-28 17:17:13.247 7eff1f7cd1c0 4 rocksdb: Options.ttl: 0 2020-10-28 17:17:13.248 7eff1f7cd1c0 3 rocksdb: [db/version_set.cc:2581] More existing levels in DB than needed. max_bytes_for_level_multiplier may not be guaranteed. 2020-10-28 17:17:13.248 7eff1f7cd1c0 4 rocksdb: [db/version_set.cc:3757] Recovered from manifest file:/var/lib/ceph/mon/ceph-mgmt03/store.db/MANIFEST-009877 succeeded,manifest_file_number is 9877, next_file_number is 9879, last_sequence is 2479, log_number is 9872,prev_log_number is 0,max_column_family is 0,min_log_number_to_keep is 0 2020-10-28 17:17:13.248 7eff1f7cd1c0 4 rocksdb: [db/version_set.cc:3766] Column family [default] (ID 0), log number is 9872 2020-10-28 17:17:13.248 7eff1f7cd1c0 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1603930633249084, "job": 1, "event": "recovery_started", "log_files": [9878]} 2020-10-28 17:17:13.248 7eff1f7cd1c0 4 rocksdb: [db/db_impl_open.cc:583] Recovering log #9878 mode 2 2020-10-28 17:17:13.249 7eff1f7cd1c0 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1603930633250074, "cf_name": "default", "job": 1, "event": "table_file_creation", "file_number": 9879, "file_size": 1336, "table_properties": {"data_size": 434, "index_size": 28, "filter_size": 69, "raw_key_size": 34, "raw_average_key_size": 34, "raw_value_size": 383, "raw_average_value_size": 383, "num_data_blocks": 1, "num_entries": 1, "filter_policy_name": "rocksdb.BuiltinBloomFilter"}} 2020-10-28 17:17:13.249 7eff1f7cd1c0 4 rocksdb: [db/version_set.cc:3036] Creating manifest 9880 2020-10-28 17:17:13.249 7eff1f7cd1c0 3 rocksdb: [db/version_set.cc:2581] More existing levels in DB than needed. max_bytes_for_level_multiplier may not be guaranteed. 2020-10-28 17:17:13.250 7eff1f7cd1c0 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1603930633251509, "job": 1, "event": "recovery_finished"} 2020-10-28 17:17:13.253 7eff1f7cd1c0 4 rocksdb: DB pointer 0x55beb3d26400 2020-10-28 17:17:13.253 7eff05253700 4 rocksdb: [db/db_impl.cc:777] ------- DUMPING STATS ------- 2020-10-28 17:17:13.253 7eff05253700 4 rocksdb: [db/db_impl.cc:778] ** DB Stats ** Uptime(secs): 0.0 total, 0.0 interval Cumulative writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 GB, 0.00 MB/s Cumulative WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 GB, 0.00 MB/s Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent Interval writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 MB, 0.00 MB/s Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 MB, 0.00 MB/s Interval stall: 00:00:0.000 H:M:S, 0.0 percent ** Compaction Stats [default] ** Level Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- L0 2/0 2.61 KB 0.5 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.4 0.00 0.00 1 0.001 0 0 L6 1/0 2.39 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0 0.000 0 0 Sum 3/0 5.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.4 0.00 0.00 1 0.001 0 0 Int 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.4 0.00 0.00 1 0.001 0 0 ** Compaction Stats [default] ** Priority Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- User 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.4 0.00 0.00 1 0.001 0 0 Uptime(secs): 0.0 total, 0.0 interval Flush(GB): cumulative 0.000, interval 0.000 AddFile(GB): cumulative 0.000, interval 0.000 AddFile(Total Files): cumulative 0, interval 0 AddFile(L0 Files): cumulative 0, interval 0 AddFile(Keys): cumulative 0, interval 0 Cumulative compaction: 0.00 GB write, 0.22 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds Interval compaction: 0.00 GB write, 0.22 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count ** File Read Latency Histogram By Level [default] ** ** Compaction Stats [default] ** Level Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- L0 2/0 2.61 KB 0.5 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.4 0.00 0.00 1 0.001 0 0 L6 1/0 2.39 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0 0.000 0 0 Sum 3/0 5.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.4 0.00 0.00 1 0.001 0 0 Int 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0 0.000 0 0 ** Compaction Stats [default] ** Priority Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- User 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.4 0.00 0.00 1 0.001 0 0 Uptime(secs): 0.0 total, 0.0 interval Flush(GB): cumulative 0.000, interval 0.000 AddFile(GB): cumulative 0.000, interval 0.000 AddFile(Total Files): cumulative 0, interval 0 AddFile(L0 Files): cumulative 0, interval 0 AddFile(Keys): cumulative 0, interval 0 Cumulative compaction: 0.00 GB write, 0.21 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count ** File Read Latency Histogram By Level [default] ** 2020-10-28 17:17:13.253 7eff1f7cd1c0 0 mon.mgmt03 does not exist in monmap, will attempt to join an existing cluster 2020-10-28 17:17:13.254 7eff1f7cd1c0 0 using public_addr v2:10.2.1.1:0/0 -> [v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0] 2020-10-28 17:17:13.254 7eff1f7cd1c0 0 starting mon.mgmt03 rank -1 at public addrs [v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0] at bind addrs [v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0] mon_data /var/lib/ceph/mon/ceph-mgmt03 fsid 374aed9e-5fc1-47e1-8d29-4416f7425e76 2020-10-28 17:17:13.256 7eff1f7cd1c0 1 mon.mgmt03@-1(???) e2 preinit fsid 374aed9e-5fc1-47e1-8d29-4416f7425e76 2020-10-28 17:17:13.256 7eff1f7cd1c0 1 mon.mgmt03@-1(???) e2 initial_members mgmt01,mgmt02,mgmt03, filtering seed monmap 2020-10-28 17:17:13.256 7eff1f7cd1c0 1 mon.mgmt03@-1(???) e2 preinit clean up potentially inconsistent store state 2020-10-28 17:17:13.258 7eff1f7cd1c0 0 -- [v2:10.2.1.1:3300/0,v1:10.2.1.1:6789/0] send_to message mon_probe(probe 374aed9e-5fc1-47e1-8d29-4416f7425e76 name mgmt03 new mon_release 14) v7 with empty dest

3 years, 5 months

3
4
0 0

pgs stuck backfill_toofull

by Mark Johnson

I've been struggling with this one for a few days now. We had an OSD report as near full a few days ago. Had this happen a couple of times before and a reweight-by-utilization has sorted it out in the past. Tried the same again but this time we ended up with a couple of pgs in a state of backfill_toofull and a handful of misplaced objects as a result. Tried doing the reweight a few more times and it's been moving data around. We did have another osd trigger the near full alert but running the reweight a couple more times seems to have moved some of that data around a bit better. However, the original near_full osd doesn't seem to have changed much and the backfill_toofull pgs are still there. I'd keep doing the reweight-by-utilization but I'm not sure if I'm heading down the right path and if it will eventually sort it out. We have 14 pools, but the vast majority of data resides in just one of those pools (pool 20). The pgs in the backfill state are in pool 2 (as far as I can tell). That particular pool is used for some cephfs stuff and has a handful of large files in there (not sure if this is significant to the problem). All up, our utilization is showing as 55.13% but some of our OSDs are showing as 76% in use with this one problem sitting at 85.02%. Right now, I'm just not sure what the proper corrective action is. The last couple of reweights I've run have been a bit more targetted in that I've set it to only function on two OSDs at a time. If I run a test-reweight targetting only one osd, it does say it will reweight OSD 9 (the one at 85.02%). I gather this will move data away from this OSD and potentially get it below the threshold. However, at one point in the past couple of days, it's shown as no OSDs in a near full state, yet the two pgs in backfill_toofull didn't change. So, that's why I'm not sure continually reweighting is going to solve this issue. I'm a long way from knowledgable on Ceph so I'm not really sure what information is useful here. Here's a bit of info on what I'm seeing. Can provide anything else that might help. Basically, we have a three node cluster but only two have OSDs. The third is there simply to enable a quorum to be established. The OSDs are evenly spread across these two needs and the configuration of each is identical. We are running Jewel and are not in a position to upgrade at this stage. # ceph --version ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e) # ceph health detail HEALTH_WARN 2 pgs backfill_toofull; 2 pgs stuck unclean; recovery 33/62099566 objects misplaced (0.000%); 1 near full osd(s) pg 2.52 is stuck unclean for 201822.031280, current state active+remapped+backfill_toofull, last acting [17,3] pg 2.18 is stuck unclean for 202114.617682, current state active+remapped+backfill_toofull, last acting [18,2] pg 2.18 is active+remapped+backfill_toofull, acting [18,2] pg 2.52 is active+remapped+backfill_toofull, acting [17,3] recovery 33/62099566 objects misplaced (0.000%) osd.9 is near full at 85% # ceph osd df ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS 2 1.37790 1.00000 1410G 842G 496G 59.75 1.08 33 3 1.37790 0.45013 1410G 1079G 259G 76.49 1.39 21 4 1.37790 0.95001 1410G 1086G 253G 76.98 1.40 44 5 1.37790 1.00000 1410G 617G 722G 43.74 0.79 43 6 1.37790 0.65009 1410G 616G 722G 43.69 0.79 39 7 1.37790 0.95001 1410G 495G 844G 35.10 0.64 40 8 1.37790 1.00000 1410G 732G 606G 51.93 0.94 52 9 1.37790 0.70007 1410G 1199G 139G 85.02 1.54 37 10 1.37790 1.00000 1410G 611G 727G 43.35 0.79 41 11 1.37790 0.75006 1410G 495G 843G 35.11 0.64 32 0 1.37790 1.00000 1410G 731G 608G 51.82 0.94 43 12 1.37790 1.00000 1410G 851G 487G 60.36 1.09 44 13 1.37790 1.00000 1410G 378G 960G 26.82 0.49 38 14 1.37790 1.00000 1410G 969G 370G 68.68 1.25 37 15 1.37790 1.00000 1410G 724G 614G 51.35 0.93 35 16 1.37790 1.00000 1410G 491G 847G 34.84 0.63 43 17 1.37790 1.00000 1410G 862G 476G 61.16 1.11 50 18 1.37790 0.80005 1410G 1083G 255G 76.78 1.39 26 19 1.37790 0.65009 1410G 963G 375G 68.29 1.24 23 20 1.37790 1.00000 1410G 724G 614G 51.38 0.93 42 TOTAL 28219G 15557G 11227G 55.13 MIN/MAX VAR: 0.49/1.54 STDDEV: 15.57 # ceph pg ls backfill_toofull pg_stat objects mip degr misp unf bytes log disklog state state_stamp v reported up up_primary acting acting_primary last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp 2.18 9 0 0 18 0 0 3653 3653 active+remapped+backfill_toofull 2020-10-29 05:31:20.429912 610'549153 656:390372 [9,12] 9 [18,2] 18 594'547482 2020-10-25 20:28:39.680744 594'543841 2020-10-21 21:21:33.092868 2.52 15 0 0 15 0 0 4883 4883 active+remapped+backfill_toofull 2020-10-29 05:31:28.277898 652'502085 656:367288 [17,9] 17 [17,3] 17 594'499108 2020-10-26 11:06:48.417825 594'499108 2020-10-26 11:06:48.417825 pool : 17 18 19 11 20 21 12 13 0 14 1 15 2 16 | SUM -------------------------------------------------------------------------------------------------------------------------------- osd.4 3 0 0 0 9 2 0 0 12 1 9 0 7 1 | 44 osd.17 1 0 0 0 7 3 1 0 8 1 17 1 11 0 | 50 osd.18 0 0 0 0 9 0 0 0 4 0 7 0 5 0 | 25 osd.5 0 0 0 2 5 1 1 0 5 0 16 0 11 2 | 43 osd.6 0 1 0 1 5 2 0 0 9 0 13 1 7 0 | 39 osd.19 0 0 1 0 8 2 0 1 2 0 6 0 3 0 | 23 osd.7 0 0 0 0 4 1 1 0 3 0 12 0 19 0 | 40 osd.8 0 1 0 0 6 3 0 2 10 1 13 1 15 0 | 52 osd.9 1 0 2 0 10 2 0 0 4 1 6 1 10 0 | 37 osd.10 0 0 1 1 5 2 0 1 7 0 12 0 11 1 | 41 osd.20 1 0 0 0 6 1 0 1 7 0 8 1 17 0 | 42 osd.11 0 0 0 0 4 1 1 1 5 0 11 0 9 0 | 32 osd.12 0 0 1 1 7 1 0 0 5 1 12 1 14 1 | 44 osd.13 0 2 0 0 3 1 0 0 10 1 11 0 10 0 | 38 osd.0 0 1 0 1 6 3 0 1 7 0 11 0 13 0 | 43 osd.14 1 0 0 0 8 1 1 0 4 1 12 0 9 0 | 37 osd.15 1 0 2 1 6 1 1 0 8 0 7 0 6 2 | 35 osd.2 0 2 1 0 7 2 1 0 7 1 4 1 6 0 | 32 osd.3 0 0 0 0 9 0 0 0 2 0 4 0 5 0 | 20 osd.16 0 1 0 1 4 3 1 1 9 0 9 1 12 1 | 43 -------------------------------------------------------------------------------------------------------------------------------- SUM : 8 8 8 8 128 32 8 8 128 8 200 8 200 8 |

3 years, 5 months

4
8
0 0

Fix PGs states

by Ing. Luis Felipe Domínguez Vega

Hi: I have this ceph status: ----------------------------------------------------------------------------- cluster: id: 039bf268-b5a6-11e9-bbb7-d06726ca4a78 health: HEALTH_WARN noout flag(s) set 1 osds down Reduced data availability: 191 pgs inactive, 2 pgs down, 35 pgs incomplete, 290 pgs stale 5 pgs not deep-scrubbed in time 7 pgs not scrubbed in time 327 slow ops, oldest one blocked for 233398 sec, daemons [osd.12,osd.36,osd.5] have slow ops. services: mon: 1 daemons, quorum fond-beagle (age 23h) mgr: fond-beagle(active, since 7h) osd: 48 osds: 45 up (since 95s), 46 in (since 8h); 4 remapped pgs flags noout data: pools: 7 pools, 2305 pgs objects: 350.37k objects, 1.5 TiB usage: 3.0 TiB used, 38 TiB / 41 TiB avail pgs: 6.681% pgs unknown 1.605% pgs not active 1835 active+clean 279 stale+active+clean 154 unknown 22 incomplete 10 stale+incomplete 2 down 2 remapped+incomplete 1 stale+remapped+incomplete -------------------------------------------------------------------------------------------- How can i fix all of unknown, incomplete, remmaped+incomplete, etc... i dont care if i need remove PGs

3 years, 5 months

5
9
0 0

14.2.12 breaks mon_host pointing to Round Robin DNS entry

by Wido den Hollander

Hi, I already submitted a ticket: https://tracker.ceph.com/issues/47951 Maybe other people noticed this as well. Situation: - Cluster is running IPv6 - mon_host is set to a DNS entry - DNS entry is a Round Robin with three AAAA-records root@wido-standard-benchmark:~# ceph -s unable to parse addrs in 'mon.objects.xx.xxx.net' [errno 22] error connecting to the cluster root@wido-standard-benchmark:~# The relevant part of the ceph.conf: [global] auth_client_required = cephx auth_cluster_required = cephx auth_service_required = cephx mon_host = mon.objects.xxx.xxx.xxx ms_bind_ipv6 = true This works fine with 14.2.11 and breaks under 14.2.12 Anybody else seeing this as well? Wido

3 years, 5 months

5
5
0 0

OSD down, how to reconstruct it from its main and block.db parts ?

by Wladimir Mutel

Dear all, after breaking my experimental 1-host Ceph cluster and making one its pg 'incomplete' I left it in abandoned state for some time. Now I decided to bring it back into life and found that it can not start one of its OSDs (osd.1 to name it) "ceph osd df" shows : ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS 0 hdd 0 1.00000 2.7 TiB 1.6 TiB 1.6 TiB 113 MiB 4.7 GiB 1.1 TiB 59.77 0.69 102 up 1 hdd 2.84549 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 down 2 hdd 2.84549 1.00000 2.8 TiB 2.6 TiB 2.5 TiB 57 MiB 3.8 GiB 275 GiB 90.58 1.05 176 up 3 hdd 2.84549 1.00000 2.8 TiB 2.6 TiB 2.5 TiB 57 MiB 3.9 GiB 271 GiB 90.69 1.05 185 up 4 hdd 2.84549 1.00000 2.8 TiB 2.6 TiB 2.5 TiB 63 MiB 4.2 GiB 263 GiB 90.98 1.05 184 up 5 hdd 2.84549 1.00000 2.8 TiB 2.6 TiB 2.5 TiB 52 MiB 3.8 GiB 263 GiB 90.96 1.05 178 up 6 hdd 2.53400 1.00000 2.5 TiB 2.3 TiB 2.3 TiB 173 MiB 5.2 GiB 228 GiB 91.21 1.05 178 up 7 hdd 2.53400 1.00000 2.5 TiB 2.3 TiB 2.3 TiB 147 MiB 5.2 GiB 230 GiB 91.12 1.05 168 up TOTAL 19 TiB 17 TiB 16 TiB 662 MiB 31 GiB 2.6 TiB 86.48 MIN/MAX VAR: 0.69/1.05 STDDEV: 10.90 "ceph device ls" shows : DEVICE HOST:DEV DAEMONS LIFE EXPECTANCY GIGABYTE_GP-ASACNE2100TTTDR_SN191108950380 p10s:nvme0n1 osd.1 osd.2 osd.3 osd.4 osd.5 WDC_WD30EFRX-68N32N0_WD-WCC7K1JJXVST p10s:sdd osd.1 WDC_WD30EFRX-68N32N0_WD-WCC7K1VUYPRA p10s:sda osd.6 WDC_WD30EFRX-68N32N0_WD-WCC7K2CKX8NT p10s:sdb osd.7 WDC_WD30EFRX-68N32N0_WD-WCC7K2UD8H74 p10s:sde osd.2 WDC_WD30EFRX-68N32N0_WD-WCC7K2VFTR1F p10s:sdh osd.5 WDC_WD30EFRX-68N32N0_WD-WCC7K3CYKL87 p10s:sdf osd.3 WDC_WD30EFRX-68N32N0_WD-WCC7K6FPZAJP p10s:sdc osd.0 WDC_WD30EFRX-68N32N0_WD-WCC7K7FXSCRN p10s:sdg osd.4 In my last migration, I created a bluestore volume with external block.db like this : "ceph-volume lvm prepare --bluestore --data /dev/sdd1 --block.db /dev/nvme0n1p4" And I can see this metadata by "ceph-bluestore-tool show-label --dev /dev/ceph-e53b65ba-5eb0-44f5-9160-a2328f787a0f/osd-block-8c6324a3-0364-4fad-9dcb-81a1661ee202" : { "/dev/ceph-e53b65ba-5eb0-44f5-9160-a2328f787a0f/osd-block-8c6324a3-0364-4fad-9dcb-81a1661ee202": { "osd_uuid": "8c6324a3-0364-4fad-9dcb-81a1661ee202", "size": 3000588304384, "btime": "2020-07-12T11:34:16.579735+0300", "description": "main", "bfm_blocks": "45785344", "bfm_blocks_per_key": "128", "bfm_bytes_per_block": "65536", "bfm_size": "3000588304384", "bluefs": "1", "ceph_fsid": "49cdfe90-6f6e-4afe-8558-bf14a13aadfa", "kv_backend": "rocksdb", "magic": "ceph osd volume v026", "mkfs_done": "yes", "osd_key": "AQD9ygpf+7+MABAAqtj4y1YYgxwCaAN/jgDSwg==", "ready": "ready", "require_osd_release": "14", "whoami": "1" } } and by "ceph-bluestore-tool show-label --dev /dev/nvme0n1p4" : { "/dev/nvme0n1p4": { "osd_uuid": "8c6324a3-0364-4fad-9dcb-81a1661ee202", "size": 128025886720, "btime": "2020-07-12T11:34:16.592054+0300", "description": "bluefs db" } } As you see, their osd_uuid is equal. But when I try to start it by hand : "systemctl restart ceph-osd@1" , I get this in the logs : ("journalctl -b -u ceph-osd@1") -- Logs begin at Tue 2020-10-13 19:09:49 EEST, end at Fri 2020-10-23 16:59:38 EEST. -- жов 23 16:59:36 p10s systemd[1]: Starting Ceph object storage daemon osd.1... жов 23 16:59:36 p10s systemd[1]: Started Ceph object storage daemon osd.1. жов 23 16:59:36 p10s ceph-osd[3987]: 2020-10-23T16:59:36.943+0300 7f513cebedc0 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-1/keyring: (2) No such file or directory жов 23 16:59:36 p10s ceph-osd[3987]: 2020-10-23T16:59:36.943+0300 7f513cebedc0 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-1/keyring: (2) No such file or directory жов 23 16:59:36 p10s ceph-osd[3987]: 2020-10-23T16:59:36.943+0300 7f513cebedc0 -1 AuthRegistry(0x560776222940) no keyring found at /var/lib/ceph/osd/ceph-1/keyring, disabling cephx жов 23 16:59:36 p10s ceph-osd[3987]: 2020-10-23T16:59:36.943+0300 7f513cebedc0 -1 AuthRegistry(0x560776222940) no keyring found at /var/lib/ceph/osd/ceph-1/keyring, disabling cephx жов 23 16:59:36 p10s ceph-osd[3987]: 2020-10-23T16:59:36.947+0300 7f513cebedc0 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-1/keyring: (2) No such file or directory жов 23 16:59:36 p10s ceph-osd[3987]: 2020-10-23T16:59:36.947+0300 7f513cebedc0 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-1/keyring: (2) No such file or directory жов 23 16:59:36 p10s ceph-osd[3987]: 2020-10-23T16:59:36.947+0300 7f513cebedc0 -1 AuthRegistry(0x7fff46ea5d80) no keyring found at /var/lib/ceph/osd/ceph-1/keyring, disabling cephx жов 23 16:59:36 p10s ceph-osd[3987]: 2020-10-23T16:59:36.947+0300 7f513cebedc0 -1 AuthRegistry(0x7fff46ea5d80) no keyring found at /var/lib/ceph/osd/ceph-1/keyring, disabling cephx жов 23 16:59:36 p10s ceph-osd[3987]: failed to fetch mon config (--no-mon-config to skip) жов 23 16:59:36 p10s systemd[1]: ceph-osd(a)1.service: Main process exited, code=exited, status=1/FAILURE жов 23 16:59:36 p10s systemd[1]: ceph-osd(a)1.service: Failed with result 'exit-code'. And so my question is, how to make this OSD known again to Ceph cluster without recreating it anew with ceph-volume ? I see that every folder under "/var/lib/ceph/osd/" is a tmpfs mount point filled with appropriate files and symlinks, except of "/var/lib/ceph/osd/ceph-1", which is just an empty folder not mounted anywhere. I tried to run "ceph-bluestore-tool prime-osd-dir --dev /dev/ceph-e53b65ba-5eb0-44f5-9160-a2328f787a0f/osd-block-8c6324a3-0364-4fad-9dcb-81a1661ee202 --path /var/lib/ceph/osd/ceph-1" it created some files under /var/lib/ceph/osd/ceph-1 but without tmpfs mount, and these files belonged to root. I changed owner of these files into ceph.ceph , I created appropriate symlinks for block and block.db but ceph-osd@1 did not want to start either. Only "unable to find keyring" messages disappeared. Please give any help on where to move next. Thanks in advance for your help.

3 years, 5 months

2
4
0 0

MDS_CLIENT_LATE_RELEASE: 3 clients failing to respond to capability release

by Frank Schilder

Dear cephers, I have a somewhat strange situation. I have the health warning: # ceph health detail HEALTH_WARN 3 clients failing to respond to capability release MDS_CLIENT_LATE_RELEASE 3 clients failing to respond to capability release mdsceph-12(mds.0): Client sn106.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30716617 mdsceph-12(mds.0): Client sn269.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30717358 mdsceph-12(mds.0): Client sn009.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30749150 However, these clients are not busy right now. Also, they hold almost nothing; see snippets from "session ls" below. It is possible that a very IO intensive application was running on these nodes and these release requests got stuck. How do I resolve this issue? Can I just evict the client? Version is mimic 13.2.8. Note that we execute a drop cache command after a job finishes on these clients. Its possible that the clients dropped the caps already before the MDS request was handled/received. Best regards, Frank { "id": 30717358, "num_leases": 0, "num_caps": 44, "state": "open", "request_load_avg": 0, "uptime": 6632206.332307, "replay_requests": 0, "completed_requests": 0, "reconnecting": false, "inst": "client.30717358 192.168.57.140:0/3212676185", "client_metadata": { "features": "00000000000000ff", "entity_id": "con-fs2-hpc", "hostname": "sn269.hpc.ait.dtu.dk", "kernel_version": "3.10.0-957.12.2.el7.x86_64", "root": "/hpc/home" } }, -- { "id": 30716617, "num_leases": 0, "num_caps": 48, "state": "open", "request_load_avg": 1, "uptime": 6632206.336307, "replay_requests": 0, "completed_requests": 1, "reconnecting": false, "inst": "client.30716617 192.168.56.233:0/2770977433", "client_metadata": { "features": "00000000000000ff", "entity_id": "con-fs2-hpc", "hostname": "sn106.hpc.ait.dtu.dk", "kernel_version": "3.10.0-957.12.2.el7.x86_64", "root": "/hpc/home" } }, -- { "id": 30749150, "num_leases": 0, "num_caps": 44, "state": "open", "request_load_avg": 0, "uptime": 6632206.338307, "replay_requests": 0, "completed_requests": 0, "reconnecting": false, "inst": "client.30749150 192.168.56.136:0/2578719015", "client_metadata": { "features": "00000000000000ff", "entity_id": "con-fs2-hpc", "hostname": "sn009.hpc.ait.dtu.dk", "kernel_version": "3.10.0-957.12.2.el7.x86_64", "root": "/hpc/home" } }, ================= Frank Schilder AIT Risø Campus Bygning 109, rum S14

3 years, 5 months

3
3
0 0

Very high read IO during backfilling

by Kamil Szczygieł

Hi, We're running Octopus and we've 3 control plane nodes (12 core, 64 GB memory each) that are running mon, mds and mgr and also 4 data nodes (12 core, 256 GB memory, 13x10TB HDDs each). We've increased number of PGs inside our pool, which resulted in all OSDs going crazy and reading the average of 900 M/s constantly (based on iotop). This has resulted in slow ops and very low recovery speed. Any tips on how to handle this kind of situation? We've osd_recovery_sleep_hdd set to 0.2, osd_recovery_max_active set to 5 and osd_max_backfills set to 4. Some OSDs are reporting slow ops constantly and iowait on machines is at 70-80% constantly.

3 years, 5 months

3
2
0 0

RBD low iops with 4k object size

by w1kl4s

Hello! I'm currently in process of playing around with ceph, and before migrating my servers to use it i wanted to benchmark it to have some idea what expected performance it might achieve. I am using 2 400GB Intel S3700 ssds with crush rule created as follows: ➜ ~ ceph osd crush rule create-replicated ssd_mirror default osd ssd And later creating pool ssd_test setting both size and min_size to 2. I proceeded to benchmark it with rbd bench, and using default image object size of 4M the result is about what i have been expecting (10k iops per OSD). However, when creating image with 4K block size, IOPS go down the drain, despite theoretically being better suited for it (4k writes should be amplified to 4M ones in case of 4M object size, reducing iops, while 4k object size should be the one without write amplification). Here is what i used to create the images and benchmark them: ➜ ~ rbd create -p ssd_test --size 16G --object-size 4K big_4k ➜ ~ rbd create -p ssd_test --size 16G --object-size 4M big_4m ➜ ~ rbd bench --io-type write --io-size 4K --io-pattern rand --io-total 256M ssd_test/big_4m bench type write io_size 4096 io_threads 16 bytes 268435456 pattern random SEC OPS OPS/SEC BYTES/SEC 1 8624 8640.00 35389458.37 2 12928 6472.00 26509325.75 3 22672 7562.67 30976698.73 4 37344 9340.00 38256659.85 5 52000 10403.21 42611529.30 elapsed: 6 ops: 65536 ops/sec: 10886.38 bytes/sec: 44590630.43 ➜ ~ rbd bench --io-type write --io-size 4K --io-pattern rand --io-total 256M ssd_test/big_4k bench type write io_size 4096 io_threads 16 bytes 268435456 pattern random SEC OPS OPS/SEC BYTES/SEC 1 6736 6645.67 27220675.51 2 7312 3664.00 15007751.77 3 7920 2645.33 10835290.95 4 8560 2135.46 8746841.19 5 9168 1828.03 7487596.24 6 9760 605.77 2481232.06 7 10368 610.22 2499477.33 8 10960 605.58 2480447.50 9 11536 596.15 2441847.42 10 12160 599.84 2456944.34 11 12816 610.22 2499477.33 12 13488 624.00 2555905.32 13 14144 636.29 2606249.15 14 14816 654.43 2680544.09 15 15472 661.87 2711022.99 16 16128 664.53 2721901.89 17 16800 662.40 2713191.81 18 17472 668.27 2737248.00 19 18080 650.72 2665341.09 20 18544 614.40 2516583.70 21 19056 584.20 2392875.94 22 19376 513.97 2105207.79 23 19584 416.73 1706936.95 24 19840 343.21 1405804.16 25 19984 287.77 1178705.65 26 20128 214.06 876780.01 27 20368 199.04 815255.64 28 20896 265.80 1088726.66 29 21488 339.65 1391222.48 30 22080 419.87 1719795.76 31 22688 512.82 2100513.91 32 23280 582.40 2385511.64 33 23888 598.88 2453010.08 34 24480 600.32 2458916.20 35 25104 603.83 2473304.79 36 25744 611.69 2505480.88 37 26432 629.39 2577994.94 38 27072 636.80 2608334.15 39 27760 653.39 2676272.30 40 28464 673.08 2756924.50 41 29104 673.08 2756924.50 42 29776 668.27 2737216.44 43 30400 665.60 2726299.01 44 31088 667.74 2735051.18 45 31744 654.43 2680544.09 46 32400 657.10 2691471.89 47 33056 657.05 2691283.44 48 33680 655.48 2684829.53 49 34304 642.69 2632442.61 50 35024 656.53 2689128.69 51 35616 645.27 2643006.18 52 36240 635.78 2604167.48 53 36880 641.03 2625642.38 54 37568 652.28 2671732.80 55 38192 634.62 2599385.96 56 38832 642.69 2632442.61 57 39472 647.44 2651898.81 58 40144 651.24 2667468.26 59 40816 648.56 2656512.56 60 41424 647.96 2654025.43 61 42112 653.39 2676272.30 62 42752 655.48 2684829.53 63 43392 649.08 2658636.07 64 44064 650.12 2662893.29 65 44704 653.91 2678406.49 66 45376 654.37 2680302.91 67 46048 658.67 2697926.26 68 46704 663.46 2717539.87 69 47344 654.95 2682685.09 70 48000 659.73 2702246.39 71 48672 660.26 2704411.65 72 49344 659.73 2702246.39 73 50032 665.07 2724119.71 74 50688 672.03 2752618.79 75 51344 667.20 2732847.38 76 52000 661.89 2711116.75 77 52592 650.12 2662893.29 78 53248 642.69 2632442.61 79 53920 645.37 2643426.29 80 54576 649.52 2660425.81 81 55232 649.00 2658288.92 82 55872 655.48 2684829.52 83 56544 661.32 2708752.60 84 57184 653.32 2676010.99 85 57856 656.00 2686977.39 86 58512 657.05 2691283.44 87 59216 668.27 2737216.44 88 59904 670.39 2745923.21 89 60592 679.97 2785150.68 90 61264 680.51 2787375.24 91 61936 684.25 2802700.09 92 62608 680.58 2787648.31 93 63248 667.73 2735030.17 94 63856 654.90 2682454.03 95 64496 645.88 2645539.34 96 65184 648.05 2654392.43 elapsed: 105 ops: 65536 ops/sec: 621.88 bytes/sec: 2547213.95 What could be causing such decrease in random write performance in case of 4k object size image? Best regards, w1kl4s

3 years, 5 months

1
0
0 0

Corrupted RBD image

by Ing. Luis Felipe Domínguez Vega

Hi: I tried get info from a RBD image but: ------------------------------------------------------------------------- root@fond-beagle:/# rbd list --pool cinder-ceph | grep volume-dfcca6c8-cb96-4b79-bc85-b200a061dcda > volume-dfcca6c8-cb96-4b79-bc85-b200a061dcda root@fond-beagle:/# rbd info --pool cinder-ceph volume-dfcca6c8-cb96-4b79-bc85-b200a061dcda > rbd: error opening image volume-dfcca6c8-cb96-4b79-bc85-b200a061dcda: > (2) No such file or directory ---------------------------------------------------------------------- THis is that the metadata show the image but the content was removed?

3 years, 5 months

3
3
0 0

2024

2023

2022

2021

2020

2019

ceph-users October 2020