Also seeing errors such as these:
[2020-05-01 13:15:20,970][systemd][WARNING] command returned non-zero exit status: 1
[2020-05-01 13:15:20,970][systemd][WARNING] failed activating OSD, retries left: 11
[2020-05-01 13:15:20,974][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.13 with osd_fsid dd49cd80-418e-4a8c-8ebf-a33d339663ff
[2020-05-01 13:15:20,989][systemd][WARNING] command returned non-zero exit status: 1
[2020-05-01 13:15:20,989][systemd][WARNING] failed activating OSD, retries left: 11
[2020-05-01 13:15:20,998][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.5 with osd_fsid 4eaf2baa-60f2-4045-8964-6152608c742a
[2020-05-01 13:15:21,014][systemd][WARNING] command returned non-zero exit status: 1
[2020-05-01 13:15:21,014][systemd][WARNING] failed activating OSD, retries left: 11
[2020-05-01 13:15:21,019][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.9 with osd_fsid 32f4a716-f26e-4579-a074-5d6452c22e34
[2020-05-01 13:15:21,035][systemd][WARNING] command returned non-zero exit status: 1
[2020-05-01 13:15:21,035][systemd][WARNING] failed activating OSD, retries left: 11
[2020-05-01 13:15:25,972][ceph_volume.process][INFO ] Running command: /usr/sbin/ceph-volume lvm trigger 1-0f0e6dd7-9dd8-4b48-beaa-084f55f73b32
[2020-05-01 13:15:25,994][ceph_volume.process][INFO ] Running command: /usr/sbin/ceph-volume lvm trigger 13-dd49cd80-418e-4a8c-8ebf-a33d339663ff
[2020-05-01 13:15:26,020][ceph_volume.process][INFO ] Running command: /usr/sbin/ceph-volume lvm trigger 5-4eaf2baa-60f2-4045-8964-6152608c742a
[2020-05-01 13:15:26,040][ceph_volume.process][INFO ] Running command: /usr/sbin/ceph-volume lvm trigger 9-32f4a716-f26e-4579-a074-5d6452c22e34
[2020-05-01 13:15:26,388][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.1 with osd_fsid 0f0e6dd7-9dd8-4b48-beaa-084f55f73b32
[2020-05-01 13:15:26,389][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.13 with osd_fsid dd49cd80-418e-4a8c-8ebf-a33d339663ff
[2020-05-01 13:15:26,391][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.5 with osd_fsid 4eaf2baa-60f2-4045-8964-6152608c742a
[2020-05-01 13:15:26,402][systemd][WARNING] command returned non-zero exit status: 1
[2020-05-01 13:15:26,403][systemd][WARNING] failed activating OSD, retries left: 10
[2020-05-01 13:15:26,403][systemd][WARNING] command returned non-zero exit status: 1
[2020-05-01 13:15:26,404][systemd][WARNING] failed activating OSD, retries left: 10
[2020-05-01 13:15:26,404][systemd][WARNING] command returned non-zero exit status: 1
[2020-05-01 13:15:26,405][systemd][WARNING] failed activating OSD, retries left: 10
[2020-05-01 13:15:26,411][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.9 with osd_fsid 32f4a716-f26e-4579-a074-5d6452c22e34
[2020-05-01 13:15:26,424][systemd][WARNING] command returned non-zero exit status: 1
[2020-05-01 13:15:26,424][systemd][WARNING] failed activating OSD, retries left: 10
[2020-05-01 13:15:31,408][ceph_volume.process][INFO ] Running command: /usr/sbin/ceph-volume lvm trigger 1-0f0e6dd7-9dd8-4b48-beaa-084f55f73b32
[2020-05-01 13:15:31,408][ceph_volume.process][INFO ] Running command: /usr/sbin/ceph-volume lvm trigger 5-4eaf2baa-60f2-4045-8964-6152608c742a
[2020-05-01 13:15:31,409][ceph_volume.process][INFO ] Running command: /usr/sbin/ceph-volume lvm trigger 13-dd49cd80-418e-4a8c-8ebf-a33d339663ff
[2020-05-01 13:15:31,429][ceph_volume.process][INFO ] Running command: /usr/sbin/ceph-volume lvm trigger 9-32f4a716-f26e-4579-a074-5d6452c22e34
[2020-05-01 13:15:31,743][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.5 with osd_fsid 4eaf2baa-60f2-4045-8964-6152608c742a
[2020-05-01 13:15:31,750][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.13 with osd_fsid dd49cd80-418e-4a8c-8ebf-a33d339663ff
[2020-05-01 13:15:31,752][systemd][WARNING] command returned non-zero exit status: 1
[2020-05-01 13:15:31,752][systemd][WARNING] failed activating OSD, retries left: 9
[2020-05-01 13:15:31,754][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.1 with osd_fsid 0f0e6dd7-9dd8-4b48-beaa-084f55f73b32
[2020-05-01 13:15:31,761][systemd][WARNING] command returned non-zero exit status: 1
[2020-05-01 13:15:31,762][systemd][WARNING] failed activating OSD, retries left: 9
[2020-05-01 13:15:31,764][systemd][WARNING] command returned non-zero exit status: 1
[2020-05-01 13:15:31,765][systemd][WARNING] failed activating OSD, retries left: 9
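In case it is useful, my understanding is that ceph-volume resolves each OSD id/fsid pair from LVM tags at activation time, so I was going to compare the fsids in the log above against the tags on the volumes, roughly like this (the osd.13 id/fsid pair is just the first one from the log):

# Show the volumes ceph-volume can see, with their ceph.osd_fsid tags
ceph-volume lvm list
# Or query the LVM tags directly
lvs -o lv_name,vg_name,lv_tags | grep ceph.osd_fsid
# If the tags look right, retry activation for one OSD by hand
ceph-volume lvm activate 13 dd49cd80-418e-4a8c-8ebf-a33d339663ff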
On Fri, May 1, 2020 at 2:23 PM Marco Pizzolo <marcopizzolo@gmail.com> wrote:
Hi Ashley,
Thanks for your response. Nothing that I can think of has happened recently. We are using max_mds = 1. We have 4 MDS daemons in total, so we used to have 3 on standby. Within minutes they all crash.
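For reference, this is how I am checking the MDS layout; the filesystem name below is just a placeholder for ours:

# Show active/standby MDS daemons per filesystem
ceph fs status
# Confirm max_mds for a specific filesystem (name is a placeholder)
ceph fs get cephfs | grep max_mds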
On Fri, May 1, 2020 at 2:21 PM Ashley Merrick <singapore@amerrick.co.uk> wrote:
Quickly checking the code that calls that assert (comments mine):

if (version > omap_version) {
  // Newer on-disk version: take over its object count and journal state
  omap_version = version;
  omap_num_objs = num_objs;
  omap_num_items.resize(omap_num_objs);
  journal_state = jstate;
} else if (version == omap_version) {
  // Same version: the recorded object count must match what was loaded
  ceph_assert(omap_num_objs == num_objs);
  if (jstate > journal_state)
    journal_state = jstate;
}
I'm not a dev, so I'm not sure if this will help, but it seems this could mean the MDS thinks it is either behind on omaps or too far ahead.
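If I read the code right, the table is persisted as mds<rank>_openfiles.<n> objects in the metadata pool, so something like this would show how many objects actually exist (the pool name here is a guess, substitute yours):

# List the open file table objects for MDS rank 0 (pool name is an assumption)
rados -p cephfs_metadata ls | grep mds0_openfiles

If the count there disagrees with what the MDS has recorded, that would at least confirm the table is inconsistent.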
Has anything happened recently? Are you running just a single MDS?
Hopefully someone else will see this and shed some light on what could be causing it.
---- On Sat, 02 May 2020 02:10:58 +0800 marcopizzolo@gmail.com wrote ----
Hello,
Hoping you can help me.
Ceph had been largely problem-free for us for the better part of a year.
We have a high file count in a single CephFS filesystem, and are seeing
this error in the logs:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/OpenFileTable.cc: 777: FAILED ceph_assert(omap_num_objs == num_objs)
The issue seemed to occur this morning, and neither restarting the MDS nor rebooting the servers corrects the problem.
I'm not really sure where to look next, as the MDS daemons keep crashing.
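For completeness, this is how I have been restarting the daemons and watching them crash; the unit instance is just our MDS name, which for us is the short hostname:

# Restart one MDS and follow its log up to the crash
systemctl restart ceph-mds@$(hostname -s)
journalctl -u ceph-mds@$(hostname -s) -e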
I appreciate any help you can provide.
Marco
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-leave@ceph.io