Hi,
as a follow-up:
* a full log of one OSD failing to start
https://pastebin.com/T8UQ2rZ6
* our EC pool creation in the first place
https://pastebin.com/20cC06Jn
* ceph osd dump and ceph osd erasure-code-profile get cephfs
https://pastebin.com/TRLPaWcH
As we dig further into it, this looks like a bug in the CephFS or
erasure-coding part of Ceph.
Ansgar
On Tue, 6 Aug 2019 at 14:50, Ansgar Jazdzewski
<a.jazdzewski(a)googlemail.com> wrote:
Hi folks,
we had to move one of our clusters, so we had to restart all servers.
Now we see an error on all OSDs with the EC pool.
Are we missing some options? Will an upgrade to 13.2.6 help?
Thanks,
Ansgar
2019-08-06 12:10:16.265 7fb337b83200 -1 /build/ceph-13.2.4/src/osd/ECUtil.h: In function 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread 7fb337b83200 time 2019-08-06 12:10:16.263025
/build/ceph-13.2.4/src/osd/ECUtil.h: 34: FAILED assert(stripe_width % stripe_size == 0)

ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
1: (ceph::ceph_assert_fail(char const, char const, int, char const)+0x102) [0x7fb32eeb83c2]
2: (()+0x2e5587) [0x7fb32eeb8587]
3: (ECBackend::ECBackend(PGBackend::Listener, coll_t const&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore, CephContext, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned long)+0x4de) [0xa4cbbe]
4: (PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, PGBackend::Listener, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore, CephContext)+0x2f9) [0x9474e9]
5: (PrimaryLogPG::PrimaryLogPG(OSDService, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, spg_t)+0x138) [0x8f96e8]
6: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x11d3) [0x753553]
7: (OSD::load_pgs()+0x4a9) [0x758339]
8: (OSD::init()+0xcd3) [0x7619c3]
9: (main()+0x3678) [0x64d6a8]
10: (__libc_start_main()+0xf0) [0x7fb32ca68830]
11: (_start()+0x29) [0x717389]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
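
For anyone reading the trace: the failed assertion requires that stripe_width be an exact multiple of the first argument passed to stripe_info_t (named stripe_size in the assert). Below is a minimal sketch of that check. It assumes, without confirmation in this thread, that the first argument is the number of data chunks (k) from the erasure-code profile. The function name and all numbers are hypothetical and are not taken from this cluster.

```python
def chunk_size_for(data_chunks: int, stripe_width: int) -> int:
    """Mirror the divisibility check from the failed assert.

    data_chunks  -- assumed to be k from the erasure-code profile (hypothetical mapping)
    stripe_width -- the pool's stripe_width in bytes, as listed by `ceph osd dump`
    """
    if data_chunks <= 0:
        raise ValueError("data_chunks must be positive")
    # Same condition as: assert(stripe_width % stripe_size == 0)
    if stripe_width % data_chunks != 0:
        raise ValueError(
            f"stripe_width {stripe_width} is not a multiple of {data_chunks}"
        )
    # Each data chunk then holds an equal share of one stripe.
    return stripe_width // data_chunks


if __name__ == "__main__":
    # Illustrative values only.
    print(chunk_size_for(4, 16384))      # divides evenly
    try:
        chunk_size_for(3, 16384)         # 16384 is not a multiple of 3
    except ValueError as err:
        print("would trip the assert:", err)
```

Comparing the stripe_width of the pool in `ceph osd dump` against k from `ceph osd erasure-code-profile get cephfs` in this way may show whether the two values disagree.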