We have several OSDs that crash on startup. We are running Nautilus 14.2.16; here is
the relevant bit of the log:
-10> 2021-01-08 14:52:38.800 7feec5f27c00 20 bluefs _read left 0xf1000 len 0x1000
-9> 2021-01-08 14:52:38.800 7feec5f27c00 20 bluefs _read got 4096
-8> 2021-01-08 14:52:38.800 7feec5f27c00 10 bluefs _replay 0x10f000: txn(seq
5972194 len 0x27 crc 0x9783dfc6)
-7> 2021-01-08 14:52:38.800 7feec5f27c00 20 bluefs _replay 0x10f000:
op_file_update file(ino 68757 size 0x6041f mtime 2021-01-07 21:25:57.793664 allocated
100000 extents [1:0x17e00000~100000])
-6> 2021-01-08 14:52:38.800 7feec5f27c00 10 bluefs _read h 0x55abfb20a3c0
0x110000~1000 from file(ino 1 size 0x110000 mtime 0.000000 allocated 500000 extents
[1:0x481700000~100000,1:0x7ad00000~400000])
-5> 2021-01-08 14:52:38.800 7feec5f27c00 20 bluefs _read left 0xf0000 len 0x1000
-4> 2021-01-08 14:52:38.800 7feec5f27c00 20 bluefs _read got 4096
-3> 2021-01-08 14:52:38.800 7feec5f27c00 10 bluefs _replay 0x110000: txn(seq
5972195 len 0x116 crc 0xec6cec7)
-2> 2021-01-08 14:52:38.800 7feec5f27c00 20 bluefs _replay 0x110000: op_dir_link
db/109668.log to 68759
-1> 2021-01-08 14:52:38.804 7feec5f27c00 -1
/build/ceph-14.2.16/src/os/bluestore/BlueFS.cc: In function
'int BlueFS::_replay(bool, bool)' thread 7feec5f27c00 time 2021-01-08
14:52:38.802560
/build/ceph-14.2.16/src/os/bluestore/BlueFS.cc: 1029: FAILED
ceph_assert(file->fnode.ino)
ceph version 14.2.16 (762032d6f509d5e7ee7dc008d80fe9c87086603c) nautilus (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152)
[0x55abefa51fba]
2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*,
...)+0) [0x55abefa52195]
3: (BlueFS::_replay(bool, bool)+0x4fa5) [0x55abf006f8f5]
4: (BlueFS::mount()+0x229) [0x55abf006fd79]
5: (BlueStore::_open_bluefs(bool)+0x78) [0x55abeff57958]
6: (BlueStore::_open_db(bool, bool, bool)+0x8a3) [0x55abeff58e63]
7: (BlueStore::_open_db_and_around(bool)+0x44) [0x55abeff6a1a4]
8: (BlueStore::_mount(bool, bool)+0x584) [0x55abeffc0b64]
9: (OSD::init()+0x3f3) [0x55abefb01db3]
10: (main()+0x5214) [0x55abefa5acf4]
11: (__libc_start_main()+0xe7) [0x7feec279cbf7]
12: (_start()+0x2a) [0x55abefa8c72a]
0> 2021-01-08 14:52:38.808 7feec5f27c00 -1
*** Caught signal (Aborted) **
in thread 7feec5f27c00 thread_name:ceph-osd ceph version 14.2.16
(762032d6f509d5e7ee7dc008d80fe9c87086603c) nautilus (stable)
It seems like a version of this:
https://tracker.ceph.com/issues/45519
and maybe this:
https://tracker.ceph.com/issues/21087
I haven't been able to get it to start with the stupid allocator. Here is the
ceph.conf I've been using to try to get it to start; a thread seemed to indicate
that increasing bluefs_max_log_runway would help.
[global]
fsid = <redacted>
mon initial members = <redacted>
public_network = 10.210.20.0/22
mon_host = 10.210.20.21,10.210.20.22,10.210.20.23
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
#kernel doesn't support all features, so disable this
#rbd default features = 3
[osd]
#set snap trim priority to lowest, 1
osd_snap_trim_priority = 1
osd_recovery_op_priority = 8
osd_max_backfills = 1
#keep allocator commented out normally; defaults to bitmap but
#may need stupid
#inspired by https://tracker.ceph.com/issues/45519
bluestore_allocator = stupid
bluefs_allocator = stupid
debug_bluefs = 20/20
#set to 3x default 4194304
bluefs_max_log_runway = 12582912
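For reference, this is roughly how I've been trying the workaround offline before
restarting the daemon (OSD id 31 is just an example; substitute your failing OSD's
id and data path). The options and tool invocations are standard Ceph ones; whether
they help with this particular assert is exactly what I'm unsure about:

```shell
#!/bin/sh
# Hypothetical OSD id -- replace with the id of a failing OSD.
OSD_ID=31
OSD_PATH=/var/lib/ceph/osd/ceph-${OSD_ID}

# Pass the stupid allocator and the enlarged BlueFS log runway to the
# tool via CEPH_ARGS, mirroring the [osd] overrides in ceph.conf above.
CEPH_ARGS="--bluefs-allocator=stupid --bluestore-allocator=stupid --bluefs-max-log-runway=12582912"
export CEPH_ARGS

# Guarded so the script is a no-op on machines without Ceph installed.
if command -v ceph-bluestore-tool >/dev/null 2>&1; then
    # Offline consistency check of the OSD's BlueStore before retrying it.
    ceph-bluestore-tool fsck --path "$OSD_PATH"

    # Export the BlueFS contents (the RocksDB files) for inspection,
    # which can show whether the db/ directory itself is intact.
    ceph-bluestore-tool bluefs-export --path "$OSD_PATH" \
        --out-dir "/tmp/bluefs-${OSD_ID}"
fi
```

The OSD itself stays stopped while the tool runs, since ceph-bluestore-tool needs
exclusive access to the BlueStore device.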
The cluster is having issues and it is urgent that we get these OSDs up.
The DB is on a shared NVMe device (81G) and the data disk is a 2.2TB 2.5-inch enterprise disk.
I'll be very grateful for any assistance.
Best,
Will