New subject: [Suspicious newsletter] Weird performance issue with long heartbeat and slow ops warnings

8 Oct 2020

Hello,

I have a ceph cluster running 14.2.11. I am running benchmark tests with
FIO concurrently on ~2000 volumes of 10G each. During the time initial
warm-up FIO creates a 10G file on each volume before it runs the actual
read/write I/O operations. During this time, I start seeing the Ceph
cluster reporting about 35GiB/s write throughput for a while, but after
some time I start seeing "long heartbeat" and "slow ops" warnings and
in a
few mins the throughput drops to ~1GB/s and stays there until all FIO runs
complete.

The cluster has 5 monitor nodes and 10 data nodes - each with 10x3.2TB NVME
drives. I have setup 3 OSD for each NVME, so there are a total of 300 OSDs.
Each server has 200GB uplink and there's no apparent network bottleneck as
the network is set up to support over 1Tbps bandwidth. I dont see any CPU
or memory issues also on the servers.

There is a single manager instance running on one of the mons.

The pool is configured for 3 replication factor with min_size of 2. I tried
to use pg_num of 8192 and 16384 and saw the issue with both settings.

Could you please suggest if this is a known issue or if I can tune any
parameters?

           Long heartbeat ping times on back interface seen, longest is
1202.120 msec
            Long heartbeat ping times on front interface seen, longest is
1535.191 msec
            35 slow ops, oldest one blocked for 122 sec, daemons
[osd.135,osd.14,osd.141,osd.143,osd.149,osd.15,osd.151,osd.153,osd.157,osd.162]...
have slow ops.

Regards,
Shridhar