Hello,
in my cluster one OSD after another died, until I realized that it
was simply an "abort" in the daemon, probably caused by:
2020-01-31 15:54:42.535930 7faf8f716700 -1 log_channel(cluster) log [ERR] : trim_object Snap 29c44 not in clones
Close to this message I get a stack trace:
ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)
1: /usr/bin/ceph-osd() [0xb35f7d]
2: (()+0x11390) [0x7f0fec74b390]
3: (gsignal()+0x38) [0x7f0feab43428]
4: (abort()+0x16a) [0x7f0feab4502a]
5: (__gnu_cxx::__verbose_terminate_handler()+0x16d) [0x7f0feb48684d]
6: (()+0x8d6b6) [0x7f0feb4846b6]
7: (()+0x8d701) [0x7f0feb484701]
8: (()+0x8d919) [0x7f0feb484919]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x27e) [0xc3776e]
10: (ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)+0x10dd) [0x868cfd]
11: (ReplicatedPG::repop_all_committed(ReplicatedPG::RepGather*)+0x80) [0x8690e0]
12: (Context::complete(int)+0x9) [0x6c8799]
13: (void ReplicatedBackend::sub_op_modify_reply<MOSDRepOpReply, 113>(std::tr1::shared_ptr<OpRequest>)+0x21b) [0xa5ae0b]
14: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x15b) [0xa53edb]
15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x1cb) [0x84c78b]
16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3ef) [0x6966ff]
17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x4e4) [0x696e14]
18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x71e) [0xc264fe]
19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xc29950]
20: (()+0x76ba) [0x7f0fec7416ba]
21: (clone()+0x6d) [0x7f0feac1541d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
Yes, I know it's still Hammer. I want to upgrade soon, but I want to
resolve this issue first. Losing that PG would not worry me.
So: what is the best approach? Can I use something like
ceph-objectstore-tool ... <object> remove-clone-metadata <cloneid>? I
assume 29c44 is my object, but what is the clone ID?
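In case it helps when scanning the other OSD logs, here is a small Python sketch that pulls the snap value out of such error lines. It assumes the value after "Snap" is hexadecimal (the letter "c" in 29c44 suggests so) and that the message layout matches the single line quoted above. TRIM_ERR and snap_ids are names made up for this illustration, not part of Ceph, and whether this value is what remove-clone-metadata expects as <cloneid> is exactly the open question.

```python
import re

# Pattern for the error message quoted above. The layout is taken from that
# single sample line, so other Ceph versions may format it differently.
TRIM_ERR = re.compile(r"trim_object Snap ([0-9a-fA-F]+) not in clones")


def snap_ids(log_text):
    """Return the distinct snap values (as ints) found in trim_object errors."""
    # Assumption: the value is hexadecimal, since "29c44" contains a letter.
    return sorted({int(m.group(1), 16) for m in TRIM_ERR.finditer(log_text)})


sample = (
    "2020-01-31 15:54:42.535930 7faf8f716700 -1 log_channel(cluster) log "
    "[ERR] : trim_object Snap 29c44 not in clones"
)

for sid in snap_ids(sample):
    # Print both notations so the value can be matched against other tools.
    print(f"snap: hex {sid:x}, decimal {sid}")
```

Run against a whole log file, this would show whether a single snap value or several are involved.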
Best regards,
derjohn