https://drive.switch.ch/index.php/s/Jwk0Kgy7Q1EIxuE
On 08.06.20 17:30, Igor Fedotov wrote:
> I think it's better to put the log to some public cloud and paste the
> link here..
>
>
> On 6/8/2020 6:27 PM, Harald Staub wrote:
>> (really sorry for spamming, but it is still waiting for moderator, so
>> trying with xz ...)
>>
>> On 08.06.20 17:21, Harald Staub wrote:
>>> (and now with trimmed attachment because of size restriction: only
>>> the debug log)
>>>
>>> On 08.06.20 16:53, Harald Staub wrote:
>>>> (and now with attachment ...)
>>>>
>>>> On 08.06.20 16:51, Harald Staub wrote:
>>>>> Hi Igor
>>>>>
>>>>> Thank you for looking into this! I attached the complete log of
>>>>> today, with the preceding "ceph_assert(h->file->fnode.ino != 1)"
>>>>> at 13:13:22.609, the first "FAILED ceph_assert(is_valid_io(off, len))"
>>>>> at 13:44:52.059, the debug log starting at 16:42:20.883.
>>>>>
>>>>> Cheers
>>>>> Harry
>>>>>
>>>>> On 08.06.20 16:37, Igor Fedotov wrote:
>>>>>> Hi Harald,
>>>>>>
>>>>>> was this exact OSD suffering from
>>>>>> "ceph_assert(h->file->fnode.ino != 1)"?
>>>>>>
>>>>>> Could you please collect an extended log with debug-bluefs set to 20?
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Igor
>>>>>>
>>>>>> On 6/8/2020 4:48 PM, Harald Staub wrote:
>>>>>>> This is again about our bad cluster, with far too many objects.
>>>>>>> Now another OSD crashes immediately at startup:
>>>>>>>
>>>>>>> /build/ceph-14.2.8/src/os/bluestore/KernelDevice.cc: 944: FAILED
>>>>>>> ceph_assert(is_valid_io(off, len))
>>>>>>> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>>>>>> const*)+0x152) [0x5601938e0e92]
>>>>>>> 2: (ceph::__ceph_assertf_fail(char const*, char const*, int,
>>>>>>> char const*, char const*, ...)+0) [0x5601938e106d]
>>>>>>> 3: (KernelDevice::read(unsigned long, unsigned long,
>>>>>>> ceph::buffer::v14_2_0::list*, IOContext*, bool)+0x8e0)
>>>>>>> [0x560193f2ae90]
>>>>>>> 4: (BlueFS::_read(BlueFS::FileReader*,
>>>>>>> BlueFS::FileReaderBuffer*, unsigned long, unsigned long,
>>>>>>> ceph::buffer::v14_2_0::list*, char*)+0x4f6) [0x560193ee0506]
>>>>>>> 5: (BlueFS::_replay(bool, bool)+0x489) [0x560193ee14e9]
>>>>>>> 6: (BlueFS::mount()+0x219) [0x560193ef4319]
>>>>>>> 7: (BlueStore::_open_bluefs(bool)+0x41) [0x560193de2281]
>>>>>>> 8: (BlueStore::_open_db(bool, bool, bool)+0x88c) [0x560193de347c]
>>>>>>> 9: (BlueStore::_open_db_and_around(bool)+0x44) [0x560193dfa134]
>>>>>>> 10: (BlueStore::_mount(bool, bool)+0x584) [0x560193e4a804]
>>>>>>> 11: (OSD::init()+0x3b7) [0x56019398f957]
>>>>>>> 12: (main()+0x3cdb) [0x5601938e85cb]
>>>>>>> 13: (__libc_start_main()+0xe7) [0x7f54cdf1bb97]
>>>>>>> 14: (_start()+0x2a) [0x56019391b08a]
>>>>>>>
>>>>>>> 2020-06-08 13:44:52.063 7f54d169ec00 -1 *** Caught signal
>>>>>>> (Aborted) **
>>>>>>> in thread 7f54d169ec00 thread_name:ceph-osd
>>>>>>>
>>>>>>> ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb)
>>>>>>> nautilus (stable)
>>>>>>> 1: (()+0x12890) [0x7f54cf286890]
>>>>>>> 2: (gsignal()+0xc7) [0x7f54cdf38e97]
>>>>>>> 3: (abort()+0x141) [0x7f54cdf3a801]
>>>>>>> 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>>>>>> const*)+0x1a3) [0x5601938e0ee3]
>>>>>>> 5: (ceph::__ceph_assertf_fail(char const*, char const*, int,
>>>>>>> char const*, char const*, ...)+0) [0x5601938e106d]
>>>>>>> 6: (KernelDevice::read(unsigned long, unsigned long,
>>>>>>> ceph::buffer::v14_2_0::list*, IOContext*, bool)+0x8e0)
>>>>>>> [0x560193f2ae90]
>>>>>>> 7: (BlueFS::_read(BlueFS::FileReader*,
>>>>>>> BlueFS::FileReaderBuffer*, unsigned long, unsigned long,
>>>>>>> ceph::buffer::v14_2_0::list*, char*)+0x4f6) [0x560193ee0506]
>>>>>>> 8: (BlueFS::_replay(bool, bool)+0x489) [0x560193ee14e9]
>>>>>>> 9: (BlueFS::mount()+0x219) [0x560193ef4319]
>>>>>>> 10: (BlueStore::_open_bluefs(bool)+0x41) [0x560193de2281]
>>>>>>> 11: (BlueStore::_open_db(bool, bool, bool)+0x88c) [0x560193de347c]
>>>>>>> 12: (BlueStore::_open_db_and_around(bool)+0x44) [0x560193dfa134]
>>>>>>> 13: (BlueStore::_mount(bool, bool)+0x584) [0x560193e4a804]
>>>>>>> 14: (OSD::init()+0x3b7) [0x56019398f957]
>>>>>>> 15: (main()+0x3cdb) [0x5601938e85cb]
>>>>>>> 16: (__libc_start_main()+0xe7) [0x7f54cdf1bb97]
>>>>>>> 17: (_start()+0x2a) [0x56019391b08a]
>>>>>>> NOTE: a copy of the executable, or `objdump -rdS <executable>`
>>>>>>> is needed to interpret this.
>>>>>>>
>>>>>>> There was a preceding assert (which I wrote about earlier):
>>>>>>>
>>>>>>> /build/ceph-14.2.8/src/os/bluestore/BlueFS.cc: 2261: FAILED
>>>>>>> ceph_assert(h->file->fnode.ino != 1)
>>>>>>>
>>>>>>> Any ideas I could try to save this OSD?
>>>>>>>
>>>>>>> Cheers
>>>>>>> Harry
>>>>>>> _______________________________________________
>>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io