This output seems typical for both active MDS servers:
---------------mds---------------- --mds_cache--- ------mds_log------ -mds_mem- -------mds_server------- mds_ -----objecter------ purg
req  rlat fwd  inos caps exi  imi |stry recy recd|subm evts segs repl|ino  dn  |hcr  hcs  hsr  cre  cat |sess|actv rd   wr   rdwr|purg
   0    0    0 6.0M 887k 1.0k    0|  56    0    0|   7 3.0k  139    0|6.0M 6.0M|   0  154    0    0    0|  48|   0    0   14    0|   0
   0  11k    0 6.0M 887k  236    0|  56    0    0|   7 3.0k  142    0|6.0M 6.0M|   0   99    0    0    0|  48|   1    0   31    0|   0
   0    0    0 6.0M 887k  718    0|  56    0    0|   5 3.0k  143    0|6.0M 6.0M|   0  318    0    0    0|  48|   1    0   12    0|   0
   0  13k    0 6.0M 887k 3.4k    0|  56    0    0| 197 3.2k  145    0|6.0M 6.0M|   0   43    1    0    0|  48|   8    0  207    0|   0
   0    0    0 6.0M 884k 4.9k    0|  56    0    0|   0 3.2k  145    0|6.0M 6.0M|   0    2    0    0    0|  48|   0    0   10    0|   0
   0    0    0 6.0M 884k 2.1k    0|  56    0    0|   6 3.2k  147    0|6.0M 6.0M|   0    0    1    0    0|  48|   0    0   12    0|   0
   2    0    0 6.0M 882k 1.1k    0|  56    0    0|  75 3.3k  150    0|6.0M 6.0M|   2   23    0    0    0|  48|   0    0   42    0|   0
   0    0    0 6.0M 880k   16    0|  56    0    0|  88 3.4k  152    0|6.0M 6.0M|   0   48    0    0    0|  48|   3    0  115    0|   0
   1 2.4k    0 6.0M 878k  126    0|  56    0    0| 551 2.8k  130    0|6.0M 6.0M|   1   26    2    0    0|  48|   0    0  209    0|   0
   4  210    0 6.0M 874k    0    0|  56    0    0|   5 2.8k  131    0|6.0M 6.0M|   4   14    0    0    0|  48|   0    0  488    0|   0
   1  891    0 6.0M 870k  12k    0|  56    0    0|   0 2.8k  131    0|6.0M 6.0M|   1   33    0    0    0|  48|   0    0    0    0|   0
   5   15    2 6.0M 870k 8.2k    0|  56    0    0|  79 2.9k  134    0|6.0M 6.0M|   5   27    1    0    0|  48|   0    0   22    0|   0
   1   68    0 6.0M 858k    0    0|  56    0    0|  49 2.9k  136    0|6.0M 6.0M|   1    0    1    0    0|  48|   0    0   91    0|   0
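For anyone who wants the raw numbers behind columns like wr/rd above: daemonperf just diffs successive `perf dump` snapshots from the admin socket. A minimal sketch of the same idea (the daemon name "mds.a" is a placeholder, and counter names such as objecter.op_w / mds_log.evadd are the ones I believe back those columns; verify against `perf schema` on your own cluster):

```python
import json
import subprocess
import time

def counter_rates(before, after, interval):
    """Per-second deltas for numeric perf counters between two snapshots."""
    rates = {}
    for section, counters in after.items():
        if not isinstance(counters, dict):
            continue
        for name, value in counters.items():
            if isinstance(value, (int, float)):
                prev = before.get(section, {}).get(name, 0)
                rates[f"{section}.{name}"] = (value - prev) / interval
    return rates

def perf_dump(daemon):
    # Runs on the MDS host via the admin socket; "mds.a" is a
    # placeholder -- substitute the name of your active MDS.
    out = subprocess.check_output(["ceph", "daemon", daemon, "perf", "dump"])
    return json.loads(out)

# Example usage on the MDS host (requires a running MDS):
#   first = perf_dump("mds.a"); time.sleep(5); second = perf_dump("mds.a")
#   print(counter_rates(first, second, 5.0).get("objecter.op_w"))
```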
The metadata pool is still taking 64 MB/s writes.
We have two active MDS servers, without pinning.
mds_cache_memory_limit is set to 20 GB, which ought to be enough for
anyone(tm), as only 24 GB of data is used in the metadata pool.
Does that offer any kind of clue?
On Thu, 8 Jul 2021 at 10:16, Dan van der Ster <dan(a)vanderster.com> wrote:
Hi,
That's interesting -- yes, on a lightly loaded cluster the metadata IO
should be almost nil.
You can debug what is happening using ceph daemonperf on the active
MDS, e.g.
https://pastebin.com/raw/n0iD8zXY
(Use a wide terminal to show all the columns).
Normally, lots of md io would indicate that the cache size is too
small for the workload; but since you said the clients are pretty
idle, this might not be the case for you.
Cheers, Dan
On Thu, Jul 8, 2021 at 9:36 AM Flemming Frandsen <dren.dk(a)gmail.com> wrote:
We have a Nautilus cluster where any metadata write operation is very slow.
We're seeing very light load from clients, as reported by dumping ops in
flight; often it's zero.
We're also seeing about 100 MB/s of writes to the metadata pool, constantly,
for weeks on end, which seems excessive, as only 22 GB is utilized.
Should the writes to the metadata pool not quiet down when there's
nothing going on?
Is there any way I can get information about why the MDSes are thrashing
so badly?
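One way to quantify that sustained write rate per pool is to poll `ceph osd pool stats -f json` and pull out the client write throughput. A small sketch; the pool name "cephfs_metadata" and the JSON field names (client_io_rate, write_bytes_sec) are assumptions from Nautilus-era output, so check them against your cluster's actual JSON:

```python
import json
import subprocess

def pool_write_rate(stats_json, pool_name):
    """Return client write bytes/sec for one pool, or None if not found.

    Parses the output of `ceph osd pool stats -f json`, which is a JSON
    list of per-pool objects.
    """
    for pool in json.loads(stats_json):
        if pool.get("pool_name") == pool_name:
            return pool.get("client_io_rate", {}).get("write_bytes_sec", 0)
    return None

# Example usage ("cephfs_metadata" is a placeholder pool name):
#   raw = subprocess.check_output(["ceph", "osd", "pool", "stats", "-f", "json"])
#   print(pool_write_rate(raw, "cephfs_metadata"))
```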
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io