Hi Stefan,
It looks like the BlueFS allocator is unable to provide additional space
for the BlueFS log. The root cause might be a lack of free space and/or
high space fragmentation.
Prior log lines and the disk configuration (e.g. the output of
ceph-bluestore-tool bluefs-bdev-sizes) would be helpful for further analysis.
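For reference, a minimal sketch of how that information could be gathered. It assumes the default OSD data path and a hypothetical OSD id of 12 (neither is taken from the report above), and the OSD has to be stopped while the tool runs:

```shell
# Hypothetical OSD id; replace with one of the crashing OSDs.
OSD_ID=12
# Assumed default data path; adjust if the deployment uses another layout.
OSD_PATH="/var/lib/ceph/osd/ceph-${OSD_ID}"

if command -v ceph-bluestore-tool >/dev/null 2>&1; then
    # Sizes of the devices BlueFS uses and how much of each it owns.
    ceph-bluestore-tool bluefs-bdev-sizes --path "$OSD_PATH"
    # Device labels, to show how the OSD's disks are laid out.
    ceph-bluestore-tool show-label --path "$OSD_PATH"
else
    echo "ceph-bluestore-tool not found; run this on the OSD host" >&2
fi
```

Posting the output together with the OSD log lines preceding the assert should make it easier to tell a full device from a fragmented one.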
Thanks,
Igor
On 7/6/2020 1:25 PM, Stefan Kooman wrote:
> Hi List,
>
> On one of our clusters a couple of OSDs keep crashing:
>
>> /build/ceph-13.2.8/src/os/bluestore/BlueFS.cc: 1576: FAILED assert(r
>> == 0)
>>
>> ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic
>> (stable)
>> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>> const*)+0x14e) [0x7fbae37bbf1e]
>> 2: (()+0x2fc0a7) [0x7fbae37bc0a7]
>> 3: (BlueFS::_flush_and_sync_log(std::unique_lock<std::mutex>&,
>> unsigned long, unsigned long)+0x13ed) [0xc660dd]
>> 4: (BlueFS::_fsync(BlueFS::FileWriter*,
>> std::unique_lock<std::mutex>&)+0x8a) [0xc667fa]
>> 5: (BlueRocksWritableFile::Sync()+0x63) [0xc81b13]
>> 6: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x2a9) [0xe45c69]
>> 7: (rocksdb::WritableFileWriter::Sync(bool)+0x21e) [0xe47ece]
>> 8: (rocksdb::BuildTable(std::__cxx11::basic_string<char,
>> std::char_traits<char>, std::allocator<char> > const&, rocksdb::Env*,
>> rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&,
>> rocksdb::EnvOptions const&, rocksdb::TableCache*,
>> rocksdb::InternalIterator*,
>> std::unique_ptr<rocksdb::InternalIterator,
>> std::default_delete<rocksdb::InternalIterator> >,
>> rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&,
>> std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory,
>> std::default_delete<rocksdb::IntTblPropCollectorFactory> >,
>> std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory,
>> std::default_delete<rocksdb::IntTblPropCollectorFactory> > > >
>> const*, unsigned int, std::__cxx11::basic_string<char,
>> std::char_traits<char>, std::allocator<char> > const&,
>> std::vector<unsigned long, std::allocator<unsigned long> >, unsigned
>> long, rocksdb::SnapshotChecker*, rocksdb::CompressionType,
>> rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*,
>> rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int,
>> rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned
>> long, unsigned long, rocksdb::Env::WriteLifeTimeHint)+0x1e14) [0xe72af4]
>> 9: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int,
>> rocksdb::ColumnFamilyData*, rocksdb::MemTable*,
>> rocksdb::VersionEdit*)+0xcb7) [0xcf5117]
>> 10: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long,
>> std::allocator<unsigned long> > const&, unsigned long*, bool)+0x1a0a)
>> [0xcf742a]
>> 11:
>> (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor,
>> std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool,
>> bool)+0x5d4) [0xcf7fd4]
>> 12: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&,
>> std::__cxx11::basic_string<char, std::char_traits<char>,
>> std::allocator<char> > const&,
>> std::vector<rocksdb::ColumnFamilyDescriptor,
>> std::allocator<rocksdb::ColumnFamilyDescriptor> > const&,
>> std::vector<rocksdb::ColumnFamilyHandle*,
>> std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**,
>> bool)+0x6ab) [0xcf929b]
>> 13: (rocksdb::DB::Open(rocksdb::DBOptions const&,
>> std::__cxx11::basic_string<char, std::char_traits<char>,
>> std::allocator<char> > const&,
>> std::vector<rocksdb::ColumnFamilyDescriptor,
>> std::allocator<rocksdb::ColumnFamilyDescriptor> > const&,
>> std::vector<rocksdb::ColumnFamilyHandle*,
>> std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0x22)
>> [0xcfab92]
>> 14: (RocksDBStore::do_open(std::ostream&, bool,
>> std::vector<KeyValueDB::ColumnFamily,
>> std::allocator<KeyValueDB::ColumnFamily> > const*)+0x174a) [0xc0136a]
>> 15: (BlueStore::_open_db(bool, bool)+0xd8f) [0xb8cc4f]
>> 16: (BlueStore::_mount(bool, bool)+0x4cf) [0xbbcd4f]
>> 17: (OSD::init()+0x33d) [0x751f5d]
>> 18: (main()+0x24b0) [0x643560]
>> 19: (__libc_start_main()+0xf0) [0x7fbae1762830]
>> 20: (_start()+0x29) [0x70d9f9]
>
>
> Any hints on what can be the cause of this?
>
> Gr. Stefan
>