May 2020 - ceph-users - lists.ceph.io

by Khodayar Doustar

Hi,

I have a 3-node Mimic cluster with 9 OSDs (3 OSDs on each node). I use this cluster to test the integration of an application with the S3 API.

The problem is that after a few days all the OSDs start filling up with BlueStore logs and go down and out one by one! I cannot stop the logs, and I cannot find a setting that fixes this leak. It must be a leak in the logs, because it makes no sense for BlueFS logs to fill up every OSD.

This is an example of the entries that are repeated in the BlueStore logs:

[root@server2 ~]# ceph-bluestore-tool --command bluefs-log-dump --path /var/lib/ceph/osd/ceph-5
. . .
0x40d000: op_file_update file(ino 30 size 0x7a713 mtime 2020-04-18 15:54:29.056488 bdev 1 allocated 80000 extents [1:0x130000+10000,1:0x100000+10000,1:0x140000+10000,1:0x150000+10000,1:0x160000+10000,1:0x170000+10000,1:0x180000+10000,1:0x190000+10000])
0x40e000: txn(seq 1156100 len 0x78 crc 0x3e1c626f)
0x40e000: op_file_update file(ino 30 size 0x7a72a mtime 2020-04-18 15:54:30.057828 bdev 1 allocated 80000 extents [1:0x130000+10000,1:0x100000+10000,1:0x140000+10000,1:0x150000+10000,1:0x160000+10000,1:0x170000+10000,1:0x180000+10000,1:0x190000+10000])
0x40f000: txn(seq 1156101 len 0x78 crc 0xc1f9ec5f)
0x40f000: op_file_update file(ino 30 size 0x7a741 mtime 2020-04-18 15:54:31.059252 bdev 1 allocated 80000 extents [1:0x130000+10000,1:0x100000+10000,1:0x140000+10000,1:0x150000+10000,1:0x160000+10000,1:0x170000+10000,1:0x180000+10000,1:0x190000+10000])
*** Caught signal (Segmentation fault) **
in thread 7f4108cfe600 thread_name:ceph-bluestore-
ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
1: (()+0xf5f0) [0x7f40fd47f5f0]
2: (BlueFS::_read(BlueFS::FileReader*, BlueFS::FileReaderBuffer*, unsigned long, unsigned long, ceph::buffer::list*, char*)+0x3bf) [0x55603a1ebb3f]
3: (BlueFS::_replay(bool, bool)+0x214) [0x55603a1f6654]
4: (BlueFS::log_dump(CephContext*, std::string const&, std::vector<std::string, std::allocator<std::string> > const&)+0x3b) [0x55603a1fad3b]
5: (log_dump(CephContext*, std::string const&, std::vector<std::string, std::allocator<std::string> > const&)+0x64) [0x55603a1da764]
6: (main()+0x2f3e) [0x55603a105e3e]
7: (__libc_start_main()+0xf5) [0x7f40fbe51505]
8: (()+0x23115f) [0x55603a1d915f]
2020-04-18 21:08:25.410 7f4108cfe600 -1 *** Caught signal (Segmentation fault) **
in thread 7f4108cfe600 thread_name:ceph-bluestore-
. . .

And this is the output of daemonperf for one of the remaining (filling up!) OSDs:

[root@server2 ~]# ceph daemonperf osd.5
------bluefs------- ------------bluestore------------- ----------osd-----------
jlen j wal sst |fl_l k_l io_l th_l s_l c_l r_l |ops wr rd l rop |
81M 0 0 0 | 0 0 0 0 0 0 0 | 0 0 0 0 0
81M 4.0k 1.8k 0 | 0 0 0 0 0 0 0 | 0 0 0 0 0
81M 4.0k 1.9k 0 | 0 0 0 0 0 0 0 | 0 0 0 0 0
81M 4.0k 1.9k 0 | 0 0 0 0 0 0 0 | 0 0 0 0 0
81M 4.0k 1.9k 0 | 0 0 0 0 0 0 0 | 0 0 0 0 0
^C[root@server2 ~]#

And these are the ENOSPC logs when trying to start the OSD:

[root@server2 ~]# journalctl -u -f ceph-osd@4.service
Failed to add match 'ceph-osd@4.service': Invalid argument
Failed to add filters: Invalid argument
[root@server2 ~]# journalctl -f -u ceph-osd@4.service
-- Logs begin at Sat 2020-04-18 15:38:15 +0430. --
Apr 18 21:18:22 server2 ceph-osd[17485]: 22: (()+0x378c10) [0x557348ebac10]
Apr 18 21:18:22 server2 ceph-osd[17485]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Apr 18 21:18:22 server2 systemd[1]: ceph-osd@4.service: main process exited, code=killed, status=6/ABRT
Apr 18 21:18:22 server2 systemd[1]: Unit ceph-osd@4.service entered failed state.
Apr 18 21:18:22 server2 systemd[1]: ceph-osd@4.service failed.
Apr 18 21:18:42 server2 systemd[1]: ceph-osd@4.service holdoff time over, scheduling restart.
Apr 18 21:18:42 server2 systemd[1]: Stopped Ceph object storage daemon osd.4.
Apr 18 21:18:42 server2 systemd[1]: Starting Ceph object storage daemon osd.4...
Apr 18 21:18:42 server2 systemd[1]: Started Ceph object storage daemon osd.4.
Apr 18 21:18:43 server2 ceph-osd[17777]: starting osd.4 at - osd_data /var/lib/ceph/osd/ceph-4 /var/lib/ceph/osd/ceph-4/journal
Apr 18 21:24:37 server2 ceph-osd[17777]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.8/rpm/el7/BUILD/ceph-13.2.8/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7f43c936bb80 time 2020-04-18 21:24:37.091289
Apr 18 21:24:37 server2 ceph-osd[17777]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.8/rpm/el7/BUILD/ceph-13.2.8/src/os/bluestore/BlueFS.cc: 1704: FAILED assert(0 == "bluefs enospc")
Apr 18 21:24:37 server2 ceph-osd[17777]: 2020-04-18 21:24:37.090 7f43c936bb80 -1 bluefs _allocate failed to allocate 0x on bdev 2, dne
Apr 18 21:24:37 server2 ceph-osd[17777]: 2020-04-18 21:24:37.090 7f43c936bb80 -1 bluefs _flush_range allocated: 0x0 offset: 0x0 length: 0x794de9
Apr 18 21:24:37 server2 ceph-osd[17777]: ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
Apr 18 21:24:37 server2 ceph-osd[17777]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14b) [0x7f43c074987b]
Apr 18 21:24:37 server2 ceph-osd[17777]: 2: (()+0x26fa07) [0x7f43c0749a07]
Apr 18 21:24:37 server2 ceph-osd[17777]: 3: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1ac6) [0x55ad52029266]
Apr 18 21:24:37 server2 ceph-osd[17777]: 4: (BlueRocksWritableFile::Flush()+0x3d) [0x55ad520451bd]
Apr 18 21:24:37 server2 ceph-osd[17777]: 5: (rocksdb::WritableFileWriter::Flush()+0x196) [0x55ad521f7916]
Apr 18 21:24:37 server2 ceph-osd[17777]: 6: (rocksdb::WritableFileWriter::Sync(bool)+0x2e) [0x55ad521f7bde]
Apr 18 21:24:37 server2 ceph-osd[17777]: 7: (rocksdb::BuildTable(std::string const&, rocksdb::Env*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::TableCache*, rocksdb::InternalIterator*, std::unique_ptr<rocksdb::InternalIterator, std::default_delete<rocksdb::InternalIterator> >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::string const&, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint)+0x11d8) [0x55ad5221d5a8]
Apr 18 21:24:37 server2 ceph-osd[17777]: 8: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xbe6) [0x55ad520b0d76]
Apr 18 21:24:37 server2 ceph-osd[17777]: 9: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x185b) [0x55ad520b2dcb]
Apr 18 21:24:37 server2 ceph-osd[17777]: 10: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0xa59) [0x55ad520b3d09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 11: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool)+0x689) [0x55ad520b4ab9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 12: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0x22) [0x55ad520b62e2]
Apr 18 21:24:37 server2 ceph-osd[17777]: 13: (RocksDBStore::do_open(std::ostream&, bool, std::vector<KeyValueDB::ColumnFamily, std::allocator<KeyValueDB::ColumnFamily> > const*)+0x164e) [0x55ad51fc65de]
Apr 18 21:24:37 server2 ceph-osd[17777]: 14: (BlueStore::_open_db(bool, bool)+0xcf4) [0x55ad51f527a4]
Apr 18 21:24:37 server2 ceph-osd[17777]: 15: (BlueStore::_mount(bool, bool)+0x4e9) [0x55ad51f828f9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 16: (OSD::init()+0x339) [0x55ad51b1cf09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 17: (main()+0x23d2) [0x55ad51a00d52]
Apr 18 21:24:37 server2 ceph-osd[17777]: 18: (__libc_start_main()+0xf5) [0x7f43bc6c7505]
Apr 18 21:24:37 server2 ceph-osd[17777]: 19: (()+0x378c10) [0x55ad51ad8c10]
Apr 18 21:24:37 server2 ceph-osd[17777]: *** Caught signal (Aborted) **
Apr 18 21:24:37 server2 ceph-osd[17777]: in thread 7f43c936bb80 thread_name:ceph-osd
Apr 18 21:24:37 server2 ceph-osd[17777]: 2020-04-18 21:24:37.094 7f43c936bb80 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.8/rpm/el7/BUILD/ceph-13.2.8/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7f43c936bb80 time 2020-04-18 21:24:37.091289
Apr 18 21:24:37 server2 ceph-osd[17777]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.8/rpm/el7/BUILD/ceph-13.2.8/src/os/bluestore/BlueFS.cc: 1704: FAILED assert(0 == "bluefs enospc")
Apr 18 21:24:37 server2 ceph-osd[17777]: ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
Apr 18 21:24:37 server2 ceph-osd[17777]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14b) [0x7f43c074987b]
Apr 18 21:24:37 server2 ceph-osd[17777]: 2: (()+0x26fa07) [0x7f43c0749a07]
Apr 18 21:24:37 server2 ceph-osd[17777]: 3: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1ac6) [0x55ad52029266]
Apr 18 21:24:37 server2 ceph-osd[17777]: 4: (BlueRocksWritableFile::Flush()+0x3d) [0x55ad520451bd]
Apr 18 21:24:37 server2 ceph-osd[17777]: 5: (rocksdb::WritableFileWriter::Flush()+0x196) [0x55ad521f7916]
Apr 18 21:24:37 server2 ceph-osd[17777]: 6: (rocksdb::WritableFileWriter::Sync(bool)+0x2e) [0x55ad521f7bde]
Apr 18 21:24:37 server2 ceph-osd[17777]: 7: (rocksdb::BuildTable(std::string const&, rocksdb::Env*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::TableCache*, rocksdb::InternalIterator*, std::unique_ptr<rocksdb::InternalIterator, std::default_delete<rocksdb::InternalIterator> >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::string const&, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint)+0x11d8) [0x55ad5221d5a8]
Apr 18 21:24:37 server2 ceph-osd[17777]: 8: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xbe6) [0x55ad520b0d76]
Apr 18 21:24:37 server2 ceph-osd[17777]: 9: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x185b) [0x55ad520b2dcb]
Apr 18 21:24:37 server2 ceph-osd[17777]: 10: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0xa59) [0x55ad520b3d09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 11: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool)+0x689) [0x55ad520b4ab9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 12: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0x22) [0x55ad520b62e2]
Apr 18 21:24:37 server2 ceph-osd[17777]: 13: (RocksDBStore::do_open(std::ostream&, bool, std::vector<KeyValueDB::ColumnFamily, std::allocator<KeyValueDB::ColumnFamily> > const*)+0x164e) [0x55ad51fc65de]
Apr 18 21:24:37 server2 ceph-osd[17777]: 14: (BlueStore::_open_db(bool, bool)+0xcf4) [0x55ad51f527a4]
Apr 18 21:24:37 server2 ceph-osd[17777]: 15: (BlueStore::_mount(bool, bool)+0x4e9) [0x55ad51f828f9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 16: (OSD::init()+0x339) [0x55ad51b1cf09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 17: (main()+0x23d2) [0x55ad51a00d52]
Apr 18 21:24:37 server2 ceph-osd[17777]: 18: (__libc_start_main()+0xf5) [0x7f43bc6c7505]
Apr 18 21:24:37 server2 ceph-osd[17777]: 19: (()+0x378c10) [0x55ad51ad8c10]
Apr 18 21:24:37 server2 ceph-osd[17777]: ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
Apr 18 21:24:37 server2 ceph-osd[17777]: 1: (()+0xf5f0) [0x7f43bd6bb5f0]
Apr 18 21:24:37 server2 ceph-osd[17777]: 2: (gsignal()+0x37) [0x7f43bc6db337]
Apr 18 21:24:37 server2 ceph-osd[17777]: 3: (abort()+0x148) [0x7f43bc6dca28]
Apr 18 21:24:37 server2 ceph-osd[17777]: 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x248) [0x7f43c0749978]
Apr 18 21:24:37 server2 ceph-osd[17777]: 5: (()+0x26fa07) [0x7f43c0749a07]
Apr 18 21:24:37 server2 ceph-osd[17777]: 6: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1ac6) [0x55ad52029266]
Apr 18 21:24:37 server2 ceph-osd[17777]: 7: (BlueRocksWritableFile::Flush()+0x3d) [0x55ad520451bd]
Apr 18 21:24:37 server2 ceph-osd[17777]: 8: (rocksdb::WritableFileWriter::Flush()+0x196) [0x55ad521f7916]
Apr 18 21:24:37 server2 ceph-osd[17777]: 9: (rocksdb::WritableFileWriter::Sync(bool)+0x2e) [0x55ad521f7bde]
Apr 18 21:24:37 server2 ceph-osd[17777]: 10: (rocksdb::BuildTable(std::string const&, rocksdb::Env*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::TableCache*, rocksdb::InternalIterator*, std::unique_ptr<rocksdb::InternalIterator, std::default_delete<rocksdb::InternalIterator> >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::string const&, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint)+0x11d8) [0x55ad5221d5a8]
Apr 18 21:24:37 server2 ceph-osd[17777]: 11: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xbe6) [0x55ad520b0d76]
Apr 18 21:24:37 server2 ceph-osd[17777]: 12: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x185b) [0x55ad520b2dcb]
Apr 18 21:24:37 server2 ceph-osd[17777]: 13: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0xa59) [0x55ad520b3d09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 14: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool)+0x689) [0x55ad520b4ab9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 15: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0x22) [0x55ad520b62e2]
Apr 18 21:24:37 server2 ceph-osd[17777]: 16: (RocksDBStore::do_open(std::ostream&, bool, std::vector<KeyValueDB::ColumnFamily, std::allocator<KeyValueDB::ColumnFamily> > const*)+0x164e) [0x55ad51fc65de]
Apr 18 21:24:37 server2 ceph-osd[17777]: 17: (BlueStore::_open_db(bool, bool)+0xcf4) [0x55ad51f527a4]
Apr 18 21:24:37 server2 ceph-osd[17777]: 18: (BlueStore::_mount(bool, bool)+0x4e9) [0x55ad51f828f9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 19: (OSD::init()+0x339) [0x55ad51b1cf09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 20: (main()+0x23d2) [0x55ad51a00d52]
Apr 18 21:24:37 server2 ceph-osd[17777]: 21: (__libc_start_main()+0xf5) [0x7f43bc6c7505]
Apr 18 21:24:37 server2 ceph-osd[17777]: 22: (()+0x378c10) [0x55ad51ad8c10]
Apr 18 21:24:37 server2 ceph-osd[17777]: 2020-04-18 21:24:37.099 7f43c936bb80 -1 *** Caught signal (Aborted) **
Apr 18 21:24:37 server2 ceph-osd[17777]: in thread 7f43c936bb80 thread_name:ceph-osd
Apr 18 21:24:37 server2 ceph-osd[17777]: ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
Apr 18 21:24:37 server2 ceph-osd[17777]: 1: (()+0xf5f0) [0x7f43bd6bb5f0]
Apr 18 21:24:37 server2 ceph-osd[17777]: 2: (gsignal()+0x37) [0x7f43bc6db337]
Apr 18 21:24:37 server2 ceph-osd[17777]: 3: (abort()+0x148) [0x7f43bc6dca28]
Apr 18 21:24:37 server2 ceph-osd[17777]: 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x248) [0x7f43c0749978]
Apr 18 21:24:37 server2 ceph-osd[17777]: 5: (()+0x26fa07) [0x7f43c0749a07]
Apr 18 21:24:37 server2 ceph-osd[17777]: 6: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1ac6) [0x55ad52029266]
Apr 18 21:24:37 server2 ceph-osd[17777]: 7: (BlueRocksWritableFile::Flush()+0x3d) [0x55ad520451bd]
Apr 18 21:24:37 server2 ceph-osd[17777]: 8: (rocksdb::WritableFileWriter::Flush()+0x196) [0x55ad521f7916]
Apr 18 21:24:37 server2 ceph-osd[17777]: 9: (rocksdb::WritableFileWriter::Sync(bool)+0x2e) [0x55ad521f7bde]
Apr 18 21:24:37 server2 ceph-osd[17777]: 10: (rocksdb::BuildTable(std::string const&, rocksdb::Env*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::TableCache*, rocksdb::InternalIterator*, std::unique_ptr<rocksdb::InternalIterator, std::default_delete<rocksdb::InternalIterator> >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::string const&, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint)+0x11d8) [0x55ad5221d5a8]
Apr 18 21:24:37 server2 ceph-osd[17777]: 11: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xbe6) [0x55ad520b0d76]
Apr 18 21:24:37 server2 ceph-osd[17777]: 12: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x185b) [0x55ad520b2dcb]
Apr 18 21:24:37 server2 ceph-osd[17777]: 13: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0xa59) [0x55ad520b3d09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 14: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool)+0x689) [0x55ad520b4ab9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 15: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0x22) [0x55ad520b62e2]
Apr 18 21:24:37 server2 ceph-osd[17777]: 16: (RocksDBStore::do_open(std::ostream&, bool, std::vector<KeyValueDB::ColumnFamily, std::allocator<KeyValueDB::ColumnFamily> > const*)+0x164e) [0x55ad51fc65de]
Apr 18 21:24:37 server2 ceph-osd[17777]: 17: (BlueStore::_open_db(bool, bool)+0xcf4) [0x55ad51f527a4]
Apr 18 21:24:37 server2 ceph-osd[17777]: 18: (BlueStore::_mount(bool, bool)+0x4e9) [0x55ad51f828f9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 19: (OSD::init()+0x339) [0x55ad51b1cf09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 20: (main()+0x23d2) [0x55ad51a00d52]
Apr 18 21:24:37 server2 ceph-osd[17777]: 21: (__libc_start_main()+0xf5) [0x7f43bc6c7505]
Apr 18 21:24:37 server2 ceph-osd[17777]: 22: (()+0x378c10) [0x55ad51ad8c10]
Apr 18 21:24:37 server2 ceph-osd[17777]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Apr 18 21:24:37 server2 ceph-osd[17777]: -324> 2020-04-18 21:24:37.090 7f43c936bb80 -1 bluefs _allocate failed to allocate 0x on bdev 2, dne
Apr 18 21:24:37 server2 ceph-osd[17777]: -324> 2020-04-18 21:24:37.090 7f43c936bb80 -1 bluefs _flush_range allocated: 0x0 offset: 0x0 length: 0x794de9
Apr 18 21:24:37 server2 ceph-osd[17777]: -324> 2020-04-18 21:24:37.094 7f43c936bb80 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.8/rpm/el7/BUILD/ceph-13.2.8/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7f43c936bb80 time 2020-04-18 21:24:37.091289
Apr 18 21:24:37 server2 ceph-osd[17777]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.8/rpm/el7/BUILD/ceph-13.2.8/src/os/bluestore/BlueFS.cc: 1704: FAILED assert(0 == "bluefs enospc")
Apr 18 21:24:37 server2 ceph-osd[17777]: ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
Apr 18 21:24:37 server2 ceph-osd[17777]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14b) [0x7f43c074987b]
Apr 18 21:24:37 server2 ceph-osd[17777]: 2: (()+0x26fa07) [0x7f43c0749a07]
Apr 18 21:24:37 server2 ceph-osd[17777]: 3: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1ac6) [0x55ad52029266]
Apr 18 21:24:37 server2 ceph-osd[17777]: 4: (BlueRocksWritableFile::Flush()+0x3d) [0x55ad520451bd]
Apr 18 21:24:37 server2 ceph-osd[17777]: 5: (rocksdb::WritableFileWriter::Flush()+0x196) [0x55ad521f7916]
Apr 18 21:24:37 server2 ceph-osd[17777]: 6: (rocksdb::WritableFileWriter::Sync(bool)+0x2e) [0x55ad521f7bde]
Apr 18 21:24:37 server2 ceph-osd[17777]: 7: (rocksdb::BuildTable(std::string const&, rocksdb::Env*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::TableCache*, rocksdb::InternalIterator*, std::unique_ptr<rocksdb::InternalIterator, std::default_delete<rocksdb::InternalIterator> >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::string const&, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint)+0x11d8) [0x55ad5221d5a8]
Apr 18 21:24:37 server2 ceph-osd[17777]: 8: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xbe6) [0x55ad520b0d76]
Apr 18 21:24:37 server2 ceph-osd[17777]: 9: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x185b) [0x55ad520b2dcb]
Apr 18 21:24:37 server2 ceph-osd[17777]: 10: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0xa59) [0x55ad520b3d09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 11: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool)+0x689) [0x55ad520b4ab9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 12: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0x22) [0x55ad520b62e2]
Apr 18 21:24:37 server2 ceph-osd[17777]: 13: (RocksDBStore::do_open(std::ostream&, bool, std::vector<KeyValueDB::ColumnFamily, std::allocator<KeyValueDB::ColumnFamily> > const*)+0x164e) [0x55ad51fc65de]
Apr 18 21:24:37 server2 ceph-osd[17777]: 14: (BlueStore::_open_db(bool, bool)+0xcf4) [0x55ad51f527a4]
Apr 18 21:24:37 server2 ceph-osd[17777]: 15: (BlueStore::_mount(bool, bool)+0x4e9) [0x55ad51f828f9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 16: (OSD::init()+0x339) [0x55ad51b1cf09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 17: (main()+0x23d2) [0x55ad51a00d52]
Apr 18 21:24:37 server2 ceph-osd[17777]: 18: (__libc_start_main()+0xf5) [0x7f43bc6c7505]
Apr 18 21:24:37 server2 ceph-osd[17777]: 19: (()+0x378c10) [0x55ad51ad8c10]
Apr 18 21:24:37 server2 ceph-osd[17777]: -324> 2020-04-18 21:24:37.099 7f43c936bb80 -1 *** Caught signal (Aborted) **
Apr 18 21:24:37 server2 ceph-osd[17777]: in thread 7f43c936bb80 thread_name:ceph-osd
Apr 18 21:24:37 server2 ceph-osd[17777]: ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
Apr 18 21:24:37 server2 ceph-osd[17777]: 1: (()+0xf5f0) [0x7f43bd6bb5f0]
Apr 18 21:24:37 server2 ceph-osd[17777]: 2: (gsignal()+0x37) [0x7f43bc6db337]
Apr 18 21:24:37 server2 ceph-osd[17777]: 3: (abort()+0x148) [0x7f43bc6dca28]
Apr 18 21:24:37 server2 ceph-osd[17777]: 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x248) [0x7f43c0749978]
Apr 18 21:24:37 server2 ceph-osd[17777]: 5: (()+0x26fa07) [0x7f43c0749a07]
Apr 18 21:24:37 server2 ceph-osd[17777]: 6: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1ac6) [0x55ad52029266]
Apr 18 21:24:37 server2 ceph-osd[17777]: 7: (BlueRocksWritableFile::Flush()+0x3d) [0x55ad520451bd]
Apr 18 21:24:37 server2 ceph-osd[17777]: 8: (rocksdb::WritableFileWriter::Flush()+0x196) [0x55ad521f7916]
Apr 18 21:24:37 server2 ceph-osd[17777]: 9: (rocksdb::WritableFileWriter::Sync(bool)+0x2e) [0x55ad521f7bde]
Apr 18 21:24:37 server2 ceph-osd[17777]: 10: (rocksdb::BuildTable(std::string const&, rocksdb::Env*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::TableCache*, rocksdb::InternalIterator*, std::unique_ptr<rocksdb::InternalIterator, std::default_delete<rocksdb::InternalIterator> >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::string const&, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint)+0x11d8) [0x55ad5221d5a8]
Apr 18 21:24:37 server2 ceph-osd[17777]: 11: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xbe6) [0x55ad520b0d76]
Apr 18 21:24:37 server2 ceph-osd[17777]: 12: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x185b) [0x55ad520b2dcb]
Apr 18 21:24:37 server2 ceph-osd[17777]: 13: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0xa59) [0x55ad520b3d09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 14: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool)+0x689) [0x55ad520b4ab9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 15: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0x22) [0x55ad520b62e2]
Apr 18 21:24:37 server2 ceph-osd[17777]: 16: (RocksDBStore::do_open(std::ostream&, bool, std::vector<KeyValueDB::ColumnFamily, std::allocator<KeyValueDB::ColumnFamily> > const*)+0x164e) [0x55ad51fc65de]
Apr 18 21:24:37 server2 ceph-osd[17777]: 17: (BlueStore::_open_db(bool, bool)+0xcf4) [0x55ad51f527a4]
Apr 18 21:24:37 server2 ceph-osd[17777]: 18: (BlueStore::_mount(bool, bool)+0x4e9) [0x55ad51f828f9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 19: (OSD::init()+0x339) [0x55ad51b1cf09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 20: (main()+0x23d2) [0x55ad51a00d52]
Apr 18 21:24:37 server2 ceph-osd[17777]: 21: (__libc_start_main()+0xf5) [0x7f43bc6c7505]
Apr 18 21:24:37 server2 ceph-osd[17777]: 22: (()+0x378c10) [0x55ad51ad8c10]
Apr 18 21:24:37 server2 ceph-osd[17777]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Apr 18 21:24:37 server2 ceph-osd[17777]: -324> 2020-04-18 21:24:37.090 7f43c936bb80 -1 bluefs _allocate failed to allocate 0x on bdev 2, dne
Apr 18 21:24:37 server2 ceph-osd[17777]: -324> 2020-04-18 21:24:37.090 7f43c936bb80 -1 bluefs _flush_range allocated: 0x0 offset: 0x0 length: 0x794de9
Apr 18 21:24:37 server2 ceph-osd[17777]: -324> 2020-04-18 21:24:37.094 7f43c936bb80 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.8/rpm/el7/BUILD/ceph-13.2.8/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7f43c936bb80 time 2020-04-18 21:24:37.091289
Apr 18 21:24:37 server2 ceph-osd[17777]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.8/rpm/el7/BUILD/ceph-13.2.8/src/os/bluestore/BlueFS.cc: 1704: FAILED assert(0 == "bluefs enospc")
Apr 18 21:24:37 server2 ceph-osd[17777]: ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
Apr 18 21:24:37 server2 ceph-osd[17777]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14b) [0x7f43c074987b]
Apr 18 21:24:37 server2 ceph-osd[17777]: 2: (()+0x26fa07) [0x7f43c0749a07]
Apr 18 21:24:37 server2 ceph-osd[17777]: 3: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1ac6) [0x55ad52029266]
Apr 18 21:24:37 server2 ceph-osd[17777]: 4: (BlueRocksWritableFile::Flush()+0x3d) [0x55ad520451bd]
Apr 18 21:24:37 server2 ceph-osd[17777]: 5: (rocksdb::WritableFileWriter::Flush()+0x196) [0x55ad521f7916]
Apr 18 21:24:37 server2 ceph-osd[17777]: 6: (rocksdb::WritableFileWriter::Sync(bool)+0x2e) [0x55ad521f7bde]
Apr 18 21:24:37 server2 ceph-osd[17777]: 7: (rocksdb::BuildTable(std::string const&, rocksdb::Env*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::TableCache*, rocksdb::InternalIterator*, std::unique_ptr<rocksdb::InternalIterator, std::default_delete<rocksdb::InternalIterator> >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::string const&, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint)+0x11d8) [0x55ad5221d5a8]
Apr 18 21:24:37 server2 ceph-osd[17777]: 8: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xbe6) [0x55ad520b0d76]
Apr 18 21:24:37 server2 ceph-osd[17777]: 9: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x185b) [0x55ad520b2dcb]
Apr 18 21:24:37 server2 ceph-osd[17777]: 10: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0xa59) [0x55ad520b3d09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 11: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool)+0x689) [0x55ad520b4ab9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 12: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0x22) [0x55ad520b62e2]
Apr 18 21:24:37 server2 ceph-osd[17777]: 13: (RocksDBStore::do_open(std::ostream&, bool, std::vector<KeyValueDB::ColumnFamily, std::allocator<KeyValueDB::ColumnFamily> > const*)+0x164e) [0x55ad51fc65de]
Apr 18 21:24:37 server2 ceph-osd[17777]: 14: (BlueStore::_open_db(bool, bool)+0xcf4) [0x55ad51f527a4]
Apr 18 21:24:37 server2 ceph-osd[17777]: 15: (BlueStore::_mount(bool, bool)+0x4e9) [0x55ad51f828f9]
Apr 18 21:24:37 server2 ceph-osd[17777]: 16: (OSD::init()+0x339) [0x55ad51b1cf09]
Apr 18 21:24:37 server2 ceph-osd[17777]: 17: (main()+0x23d2) [0x55ad51a00d52]
Apr 18 21:24:37 server2 ceph-osd[17777]: 18: (__libc_start_main()+0xf5) [0x7f43bc6c7505]
Apr 18 21:24:37 server2 ceph-osd[17777]: 19: (()+0x378c10) [0x55ad51ad8c10]
Apr 18 21:24:37 server2 ceph-osd[17777]: -324> 2020-04-18 21:24:37.099 7f43c936bb80 -1 *** Caught signal (Aborted) **
Apr 18 21:24:37 server2 ceph-osd[17777]: in thread 7f43c936bb80 thread_name:ceph-osd
Apr 18 21:24:37 server2 ceph-osd[17777]: ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
Apr 18 21:24:37 server2 ceph-osd[17777]: 1: (()+0xf5f0) [0x7f43bd6bb5f0]
Apr 18 21:24:37 server2 ceph-osd[17777]: 2: (gsignal()+0x37) [0x7f43bc6db337]
Apr 18 21:24:37 server2 ceph-osd[17777]: 3: (abort()+0x148) [0x7f43bc6dca28]
Apr 18 21:24:37 server2 ceph-osd[17777]: 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x248) [0x7f43c0749978]
Apr 18 21:24:37 server2 ceph-osd[17777]: 5: (()+0x26fa07) [0x7f43c0749a07]
Apr 18 21:24:37 server2 ceph-osd[17777]: 6: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1ac6) [0x55ad52029266]
Apr 18 21:24:37 server2 ceph-osd[17777]: 7: (BlueRocksWritableFile::Flush()+0x3d) [0x55ad520451bd]
Apr 18 21:24:37 server2
ceph-osd[17777]: 8: (rocksdb::WritableFileWriter::Flush()+0x196) [0x55ad521f7916] Apr 18 21:24:37 server2 ceph-osd[17777]: 9: (rocksdb::WritableFileWriter::Sync(bool)+0x2e) [0x55ad521f7bde] Apr 18 21:24:37 server2 ceph-osd[17777]: 10: (rocksdb::BuildTable(std::string const&, rocksdb::Env*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::TableCache*, rocksdb::InternalIterator*, std::unique_ptr<rocksdb::InternalIterator, std::default_delete<rocksdb::InternalIterator> >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::string const&, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint)+0x11d8) [0x55ad5221d5a8] Apr 18 21:24:37 server2 ceph-osd[17777]: 11: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xbe6) [0x55ad520b0d76] Apr 18 21:24:37 server2 ceph-osd[17777]: 12: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x185b) [0x55ad520b2dcb] Apr 18 21:24:37 server2 ceph-osd[17777]: 13: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0xa59) [0x55ad520b3d09] Apr 18 21:24:37 server2 ceph-osd[17777]: 14: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, 
std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool)+0x689) [0x55ad520b4ab9] Apr 18 21:24:37 server2 ceph-osd[17777]: 15: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::string const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0x22) [0x55ad520b62e2] Apr 18 21:24:37 server2 ceph-osd[17777]: 16: (RocksDBStore::do_open(std::ostream&, bool, std::vector<KeyValueDB::ColumnFamily, std::allocator<KeyValueDB::ColumnFamily> > const*)+0x164e) [0x55ad51fc65de] Apr 18 21:24:37 server2 ceph-osd[17777]: 17: (BlueStore::_open_db(bool, bool)+0xcf4) [0x55ad51f527a4] Apr 18 21:24:37 server2 ceph-osd[17777]: 18: (BlueStore::_mount(bool, bool)+0x4e9) [0x55ad51f828f9] Apr 18 21:24:37 server2 ceph-osd[17777]: 19: (OSD::init()+0x339) [0x55ad51b1cf09] Apr 18 21:24:37 server2 ceph-osd[17777]: 20: (main()+0x23d2) [0x55ad51a00d52] Apr 18 21:24:37 server2 ceph-osd[17777]: 21: (__libc_start_main()+0xf5) [0x7f43bc6c7505] Apr 18 21:24:37 server2 ceph-osd[17777]: 22: (()+0x378c10) [0x55ad51ad8c10] Apr 18 21:24:37 server2 ceph-osd[17777]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. Apr 18 21:24:37 server2 systemd[1]: ceph-osd@4.service: main process exited, code=killed, status=6/ABRT Apr 18 21:24:37 server2 systemd[1]: Unit ceph-osd@4.service entered failed state. Apr 18 21:24:37 server2 systemd[1]: ceph-osd@4.service failed. Apr 18 21:24:57 server2 systemd[1]: ceph-osd@4.service holdoff time over, scheduling restart. Apr 18 21:24:57 server2 systemd[1]: Stopped Ceph object storage daemon osd.4. Apr 18 21:24:57 server2 systemd[1]: Starting Ceph object storage daemon osd.4... 
Apr 18 21:24:57 server2 systemd[1]: Started Ceph object storage daemon osd.4.
And some more data:
[root@server1 ~]# ceph df detail
GLOBAL:
    SIZE    AVAIL   RAW USED  %RAW USED  OBJECTS
    38 GiB  35 GiB  2.6 GiB   6.83       303
POOLS:
    NAME                       ID  QUOTA OBJECTS  QUOTA BYTES  USED     %USED  MAX AVAIL  OBJECTS  DIRTY  READ     WRITE    RAW USED
    .rgw.root                  18  N/A            N/A          1.1 KiB  0      50 GiB     4        4      63 B     4 B      1.1 KiB
    default.rgw.meta           19  N/A            N/A          4.1 KiB  0      50 GiB     22       22     1.5 KiB  182 B    9.0 KiB
    default.rgw.log            20  N/A            N/A          0 B      0      50 GiB     107      107    307 KiB  205 KiB  0 B
    default.rgw.control        21  N/A            N/A          0 B      0      50 GiB     5        5      0 B      0 B      0 B
    default.rgw.buckets.index  22  N/A            N/A          0 B      0      50 GiB     5        5      501 B    60 B     0 B
    default.rgw.buckets.data   23  N/A            N/A          377 MiB  0.39   50 GiB     160      160    385 B    1.1 KiB  603 MiB
[root@server1 ~]#
[root@server1 ~]# ceph osd df tree
ID  CLASS  WEIGHT   REWEIGHT  SIZE    USE      DATA     OMAP   META      AVAIL   %USE  VAR   PGS  TYPE NAME
-1         0.16727  -         0 B     0 B      0 B      0 B    0 B       0 B     0     0     -    root default
-3         0.05576  -         0 B     0 B      0 B      0 B    0 B       0 B     0     0     -        host server1
0   hdd    0.01859  1.00000   0 B     0 B      0 B      0 B    0 B       0 B     0     0     0            osd.0
1   hdd    0.01859  0         0 B     0 B      0 B      0 B    0 B       0 B     0     0     0            osd.1
2   hdd    0.01859  0         0 B     0 B      0 B      0 B    0 B       0 B     0     0     0            osd.2
-5         0.05576  -         19 GiB  1.4 GiB  360 MiB  3 KiB  1024 MiB  18 GiB  0     0     -        host server2
3   hdd    0.01859  1.00000   0 B     0 B      0 B      0 B    0 B       0 B     0     0     0            osd.3
4   hdd    0.01859  0         0 B     0 B      0 B      0 B    0 B       0 B     0     0     0            osd.4
5   hdd    0.01859  1.00000   19 GiB  1.4 GiB  360 MiB  3 KiB  1024 MiB  18 GiB  7.11  1.04  99           osd.5
-7         0.05576  -         0 B     0 B      0 B      0 B    0 B       0 B     0     0     -        host server3
6   hdd    0.01859  1.00000   19 GiB  1.2 GiB  249 MiB  3 KiB  1024 MiB  18 GiB  6.55  0.96  78           osd.6
7   hdd    0.01859  1.00000   0 B     0 B      0 B      0 B    0 B       0 B     0     0     0            osd.7
8   hdd    0.01859  1.00000   0 B     0 B      0 B      0 B    0 B       0 B     0     0     0            osd.8
TOTAL                         38 GiB  2.6 GiB  610 MiB  6 KiB  2.0 GiB   35 GiB  6.83
MIN/MAX VAR: 0/1.04  STDDEV: 5.58
[root@server1 ~]#
I'm kind of a newbie to Ceph, so any help or hints would be appreciated. Did I hit a bug, or is something wrong with my configuration?
Thanks a lot,
Khodayar
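As a rough sanity check on the `ceph osd df tree` output above, the following sketch recomputes the usage of osd.5 from its DATA, OMAP and META columns. The figures are copied from that row and are already rounded by Ceph, so the results are only approximate; the variable names are illustrative and not part of any Ceph tooling.

```python
# Values copied from the osd.5 row of `ceph osd df tree` (rounded by Ceph).
MIB_PER_GIB = 1024
KIB_PER_MIB = 1024

size_gib = 19            # SIZE column
data_mib = 360           # DATA column
omap_mib = 3 / KIB_PER_MIB  # OMAP column (3 KiB)
meta_mib = 1024          # META column

# USE should be roughly DATA + OMAP + META.
used_mib = data_mib + omap_mib + meta_mib
used_gib = used_mib / MIB_PER_GIB

# Share of the used space that is metadata rather than object data.
meta_share = meta_mib / used_mib

# Utilisation relative to the reported device size.
pct_used = 100 * used_gib / size_gib

print(f"used       : {used_gib:.2f} GiB (reported: 1.4 GiB)")
print(f"meta share : {meta_share:.0%}")
print(f"%USE       : {pct_used:.2f} (reported: 7.11)")
```

With these figures, roughly three quarters of the space in use on osd.5 is accounted for by the META column rather than by object data, which is worth keeping in mind when reading the usage numbers of such small OSDs.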

3 years, 8 months


Help

by Randy Morgan

We are seeking information on configuring Ceph to work with Noobaa and NextCloud. Randy -- Randy Morgan CSR Department of Chemistry/BioChemistry Brigham Young University randym(a)chem.byu.edu

3 years, 8 months


Ceph Nautilus packages for Ubuntu Focal

by Stefan Kooman

Hi list,
We're wondering if Ceph Nautilus packages will be provided for Ubuntu Focal Fossa (20.04). You might wonder why one would not just use Ubuntu Bionic (18.04) instead of the latest LTS. Here is why: a glibc bug in Ubuntu Bionic that *might* affect Open vSwitch (OVS) users [1]. We had quite a few issues with OVS deadlocks on hypervisors, and do not want to risk experiencing the same issues on our Ceph cluster(s). I'm not sure how many of you use OVS for bridging / bonding, but for those who do, running Ceph (Nautilus / Octopus) on 20.04 would be preferred.
Gr. Stefan
[1]: https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/1839592

3 years, 8 months


Ceph influxDB support versus Telegraf Ceph plugin?

by victorhooi＠yahoo.com

Hi, I've read that Ceph has some InfluxDB reporting capabilities inbuilt (https://docs.ceph.com/docs/master/mgr/influx/). However, Telegraf, which is the system reporting daemon for InfluxDB, also has a Ceph plugin (https://github.com/influxdata/telegraf/tree/master/plugins/inputs/ceph). Just curious what people's thoughts on the two are, or what they are using in production? Which is easier to deploy/maintain, have you found? Or more useful for alerting, or tracking performance gremlins? Thanks, Victor

3 years, 8 months


IPv6 connectivity gone for Ceph Telemetry

by Wido den Hollander

Hi,
I was just checking on a few (13) IPv6-only Ceph clusters and I noticed that they can no longer send their Telemetry data:
telemetry.ceph.com has address 8.43.84.137
This server used to have dual-stack connectivity while it was still hosted at OVH. It seems to have moved to Red Hat and lost its IPv6 connectivity along the way. How can we get this back?
Wido
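The check described above (does a host name still resolve to an IPv6 address?) can be sketched with Python's standard `socket.getaddrinfo`. The helper name `address_families` and its `resolver` parameter are illustrative assumptions; the parameter exists only so the lookup can be replaced in tests.

```python
import socket


def address_families(host, port=443, resolver=socket.getaddrinfo):
    """Return the set of address families ('IPv4', 'IPv6') a host resolves to.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so
    that the lookup can be stubbed out when no network is available.
    """
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    try:
        results = resolver(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Name does not resolve at all.
        return set()
    # Each result is (family, type, proto, canonname, sockaddr).
    return {names[family] for family, *_ in results if family in names}


def is_dual_stack(host, **kwargs):
    """True if the host has both IPv4 and IPv6 addresses."""
    return address_families(host, **kwargs) == {"IPv4", "IPv6"}

# Example (needs DNS access, so it is not executed here):
#   print(address_families("telemetry.ceph.com"))
```

A result containing only "IPv4" would match the situation reported in the post; note that the answer also depends on the resolver configuration of the machine running the check.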

3 years, 9 months


Problem with OSD::osd_op_tp thread had timed out and other connected issues

by Jan Pekař - Imatic

Hello,
I have a Ceph cluster running version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable): 4 nodes, each with 11 HDDs and 1 SSD, on a 10 Gbit network. The cluster was a fresh, empty install. We filled it with data (small blocks) using RGW. The cluster is currently used only for testing, so no client was using it during the admin operations described below.
After a while (7 TB of data / 40M objects uploaded) we decided to increase pg_num from 128 to 256 to spread the data better. To speed up this operation I set
ceph config set mgr target_max_misplaced_ratio 1
so that the whole cluster rebalances as quickly as it can. I have 3 issues/questions:
1) I noticed that the manual increase from 128 to 256 caused approx. 6 OSDs to restart, logging
heartbeat_map clear_timeout 'OSD::osd_op_tp thread 0x7f8c84b8b700' had suicide timed out after 150
After a while the OSDs were back, so I continued with my tests. My question: was increasing the number of PGs with the maximum target_max_misplaced_ratio too much for those OSDs? Is it not recommended to do it this way? I had no problem with this kind of increase before, but the cluster configuration was slightly different and it was running Luminous.
2) The rebuild was still slow, so I increased the number of backfills
ceph tell osd.* injectargs "--osd-max-backfills 10"
and reduced the recovery sleep time
ceph tell osd.* injectargs "--osd-recovery-sleep-hdd 0.01"
After a few hours I noticed that some of my OSDs had been restarted during recovery. In the log I can see
...
2020-03-21 06:41:28.343 7fe1f8bee700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe1da154700' had timed out after 15
2020-03-21 06:41:28.343 7fe1f8bee700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe1da154700' had timed out after 15
2020-03-21 06:41:36.780 7fe1da154700 1 heartbeat_map clear_timeout 'OSD::osd_op_tp thread 0x7fe1da154700' had timed out after 15
2020-03-21 06:41:36.888 7fe1e7769700 0 log_channel(cluster) log [WRN] : Monitor daemon marked osd.7 down, but it is still running
2020-03-21 06:41:36.888 7fe1e7769700 0 log_channel(cluster) log [DBG] : map e3574 wrongly marked me down at e3573
2020-03-21 06:41:36.888 7fe1e7769700 1 osd.7 3574 start_waiting_for_healthy
I watched the network usage graphs, and network utilization was low during recovery (the 10 Gbit link was not saturated). So does a high number of IOPS on an OSD also cause heartbeat operations to time out? I thought that the OSD uses threads and that HDD timeouts do not influence heartbeats to other OSDs and the MONs. It looks like that is not true.
3) After the OSD was wrongly marked down, I can see that the cluster has degraded objects. There were no degraded objects before that.
Degraded data redundancy: 251754/117225048 objects degraded (0.215%), 8 pgs degraded, 8 pgs undersized
Does this mean that the OSD disconnection caused the data to become degraded? How is that possible when no OSD was lost? The data should still be on that OSD, and after peering everything should be OK. With Luminous I had no such problem: after the OSD came back up, degraded objects were recovered/found within a few seconds and the cluster was healthy again.
Thank you very much for any additional info. I can perform any additional tests you recommend, because the cluster is used for testing purposes now.
With regards
Jan Pekar
--
============
Ing. Jan Pekař jan.pekar(a)imatic.cz
----
Imatic | Jagellonská 14 | Praha 3 | 130 00
http://www.imatic.cz | +420326555326
============
--
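To get an overview of how often these messages occur, the OSD log lines quoted in the post can be tallied with a small script. This is only a sketch: the regular expression is derived from the few lines shown above, and the function name `summarize_heartbeats` is made up for illustration; real logs may contain variants it does not match.

```python
import re
from collections import Counter

# Pattern built from the log lines quoted in the post (an assumption about
# the general format, not an official specification).
TIMEOUT_RE = re.compile(
    r"heartbeat_map (?P<check>\w+) '(?P<thread>[^']+)' "
    r"had (?P<suicide>suicide )?timed out after (?P<seconds>\d+)"
)


def summarize_heartbeats(lines):
    """Count ordinary timeouts and suicide timeouts per thread, plus the
    number of 'wrongly marked me down' events."""
    timeouts = Counter()
    suicides = Counter()
    wrongly_down = 0
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            target = suicides if match.group("suicide") else timeouts
            target[match.group("thread")] += 1
        elif "wrongly marked me down" in line:
            wrongly_down += 1
    return timeouts, suicides, wrongly_down


# Sample input: lines copied from the post.
SAMPLE = [
    "heartbeat_map clear_timeout 'OSD::osd_op_tp thread 0x7f8c84b8b700' had suicide timed out after 150",
    "2020-03-21 06:41:28.343 7fe1f8bee700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe1da154700' had timed out after 15",
    "2020-03-21 06:41:28.343 7fe1f8bee700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe1da154700' had timed out after 15",
    "2020-03-21 06:41:36.780 7fe1da154700 1 heartbeat_map clear_timeout 'OSD::osd_op_tp thread 0x7fe1da154700' had timed out after 15",
    "2020-03-21 06:41:36.888 7fe1e7769700 0 log_channel(cluster) log [DBG] : map e3574 wrongly marked me down at e3573",
]

timeouts, suicides, wrongly_down = summarize_heartbeats(SAMPLE)
for thread, count in timeouts.items():
    print(f"{thread}: {count} timeout message(s)")
print(f"suicide timeouts: {sum(suicides.values())}, wrongly marked down: {wrongly_down}")
```

Run over a full OSD log, a summary like this would show whether the timeouts are concentrated on a few threads and how they line up with the "wrongly marked me down" events.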

3 years, 9 months


bluestore_default_buffered_write = true

by Adam Koczarski

Has anyone ever tried using this feature? I've added it to the [global] section of ceph.conf on my POC cluster, but I'm not sure how to tell whether it's actually working. I did find a reference to this feature via Google, and they had it in their [OSD] section. I've tried that too. TIA, Adam
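One way to see which value a running OSD has actually picked up is to ask the daemon over its admin socket with `ceph daemon osd.<id> config get <option>`, run on the host where that OSD lives. The sketch below wraps that command; the helper names are illustrative, and the assumption that the reply is a one-key JSON object with a string value should be checked against the output on your own cluster. It only shows the effective setting, not whether writes are in fact being buffered.

```python
import json
import subprocess

OPTION = "bluestore_default_buffered_write"


def parse_config_get(raw_json, option=OPTION):
    """Interpret the reply of `ceph daemon ... config get <option>`.

    Assumes a reply such as {"bluestore_default_buffered_write": "true"}.
    """
    value = json.loads(raw_json)[option]
    return str(value).lower() == "true"


def running_value(osd_id, option=OPTION):
    """Query a local OSD through its admin socket (must run on the OSD host)."""
    cmd = ["ceph", "daemon", f"osd.{osd_id}", "config", "get", option]
    reply = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_config_get(reply.stdout, option)

# Example (requires a running OSD on this host, so it is not executed here):
#   print(running_value(0))
```

If the option was added to ceph.conf after the OSD was started, the daemon will typically need a restart before the new value shows up here.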

3 years, 9 months


Poor Windows performance on ceph RBD.

by Frank Schilder

Dear all,
maybe someone can give me a pointer here. We are running OpenNebula with Ceph RBD as a back-end store. We have a pool of spinning disks for creating large low-demand data disks, mainly for backups and other cold storage. Everything is fine when using Linux VMs. However, Windows VMs perform poorly: they are roughly a factor of 20 slower than a similarly created Linux VM. If anyone has pointers on what to look for, we would be very grateful.
The OpenNebula installation is more or less default. The current OS and libvirt versions we use are:
CentOS 7.6 with stock kernel 3.10.0-1062.1.1.el7.x86_64
libvirt-client.x86_64 4.5.0-23.el7_7.1 @updates
qemu-kvm-ev.x86_64 10:2.12.0-33.1.el7 @centos-qemu-ev
Some benchmark results, from best to worst workload:
rbd bench --io-size 4M --io-total 4G --io-pattern seq --io-type write --io-threads 16 : 450MB/s
rbd bench --io-size 4M --io-total 4G --io-pattern seq --io-type write --io-threads 1 : 230MB/s
rbd bench --io-size 1M --io-total 4G --io-pattern seq --io-type write --io-threads 1 : 190MB/s
rbd bench --io-size 64K --io-total 4G --io-pattern seq --io-type write --io-threads 1 : 150MB/s
rbd bench --io-size 64K --io-total 1G --io-pattern rand --io-type write --io-threads 1 : 26MB/s
dd with conv=fdatasync gives an awesome 500MB/s inside a Linux VM for a sequential write of 4GB.
We copied a couple of large ISO files inside the Windows VM, and for the first ca. 1 to 1.5G it performs as expected. Thereafter, however, write speed drops rapidly to ca. 25MB/s and does not recover. It is almost as if Windows translates large sequential writes into small random writes. If anyone has seen and solved this before, please let us know.
Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
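To compare the benchmark results above on a common scale, the throughput figures can be converted into the number of write operations per second each one implies. This is a back-of-the-envelope sketch: it assumes that "MB/s" means MiB/s and that the `--io-size` suffixes K and M are binary units, which may not be exactly how the tool reports them; the function name is illustrative.

```python
KIB_PER_MIB = 1024


def implied_iops(throughput_mib_s, io_size_kib):
    """Operations per second implied by a throughput at a fixed I/O size."""
    return throughput_mib_s * KIB_PER_MIB / io_size_kib


# (label, throughput in MiB/s, I/O size in KiB), copied from the post.
RESULTS = [
    ("4M seq, 16 threads", 450, 4096),
    ("4M seq, 1 thread", 230, 4096),
    ("1M seq, 1 thread", 190, 1024),
    ("64K seq, 1 thread", 150, 64),
    ("64K rand, 1 thread", 26, 64),
]

for label, mib_s, size_kib in RESULTS:
    print(f"{label:20s} {mib_s:4d} MiB/s  {implied_iops(mib_s, size_kib):8.1f} ops/s")
```

Under these assumptions the 64K random case works out to about 416 writes per second, and the roughly 25MB/s seen inside the Windows VM would correspond to a similar rate at 64K, which is at least consistent with the small-random-write suspicion raised in the post.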

3 years, 9 months


OSDs taking too much memory, for buffer_anon

by Harald Staub

As a follow-up to our recent memory problems with OSDs (with high pglog values: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/LJPJZPBSQRJ… ), we also see high buffer_anon values, e.g. more than 4 GB with "osd memory target" set to 3 GB. Is there a way to restrict it? As it is called "anon", I guess it would first be necessary to find out what exactly is behind it. Well, maybe it is just as Wido said: with lots of small objects, there will be several problems. Cheers Harry
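Per-pool memory figures such as buffer_anon can be read from an OSD with `ceph daemon osd.<id> dump_mempools`. The sketch below compares one pool against a memory target. It assumes the JSON reply is laid out as `mempool -> by_pool -> <pool> -> bytes`; that layout, the helper names, and the sample numbers are assumptions for illustration and should be checked against the output of your Ceph version.

```python
import json

GIB = 1024 ** 3


def pool_bytes(dump_json, pool="buffer_anon"):
    """Bytes used by one mempool, given the text of a dump_mempools reply.

    Assumes the layout {"mempool": {"by_pool": {<pool>: {"bytes": N}}}}.
    """
    doc = json.loads(dump_json)
    return doc["mempool"]["by_pool"][pool]["bytes"]


def exceeds_target(dump_json, target_bytes, pool="buffer_anon"):
    """True if a single pool alone is larger than the memory target."""
    return pool_bytes(dump_json, pool) > target_bytes


# Made-up sample reply mirroring the situation in the post
# (buffer_anon above 4 GB, memory target 3 GB); not real cluster output.
SAMPLE = json.dumps({
    "mempool": {
        "by_pool": {
            "buffer_anon": {"items": 1000, "bytes": int(4.5 * GIB)},
            "osd_pglog": {"items": 2000, "bytes": 1 * GIB},
        }
    }
})

print(pool_bytes(SAMPLE) / GIB, "GiB in buffer_anon")
print("over target:", exceeds_target(SAMPLE, 3 * GIB))
```

Collecting this value periodically from each OSD would show whether buffer_anon grows steadily or only spikes under particular workloads.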

3 years, 9 months


ceph qos

by 展荣臻（信泰）

Hi everyone, there are two types of QoS in Ceph (one based on the token bucket algorithm, another based on mClock). Which one can I use in a Nautilus production environment? Thank you
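For readers unfamiliar with the first of the two approaches mentioned, here is a minimal, generic token bucket in Python. It is a textbook sketch, not Ceph's implementation: the class name, parameters, and the explicit `now` argument (used instead of a real clock so the behaviour is easy to follow) are all illustrative choices.

```python
class TokenBucket:
    """Generic token bucket: tokens refill at `rate` per second up to
    `capacity`; each request consumes `cost` tokens or is rejected."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket
        self.last = now           # time of the last refill

    def allow(self, now, cost=1.0):
        # Refill according to the time elapsed since the last call.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Two requests per second on average, bursts of up to two requests.
bucket = TokenBucket(rate=2, capacity=2)
for t in (0.0, 0.0, 0.0, 0.5, 1.5):
    print(t, bucket.allow(t))
```

A token bucket only caps the rate of a single stream. mClock-style schedulers, by contrast, arbitrate between several clients using per-client reservation, weight, and limit settings, so the two approaches address somewhat different problems.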

3 years, 9 months


