Thanks,
I’ll try adjusting mds_cache_memory_limit. I did get some messages about MDS being slow
trimming the cache, which implies that it was over its cache size.
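For reference, this is roughly what I plan to try (the 8 GiB value is just an example, not a recommendation):

    # check the current limit (default is 4 GiB)
    ceph config get mds mds_cache_memory_limit
    # raise it at runtime, value in bytes (example: 8 GiB)
    ceph config set mds mds_cache_memory_limit 8589934592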
I never had any problems with the kernel mount, fortunately. I am running 17.2.5 (Quincy).
My metadata pool size is about 15 GB, with a data pool of 170 TB stored / 85M objects.
Which switch did you make? Moving metadata to SSD, or increasing mds_cache_memory_limit?
George
On Mar 27, 2023, at 1:58 PM, Marc <Marc(a)f1-outsourcing.eu> wrote:
We have a ceph cluster (Proxmox based) which is HDD-based. We’ve had
some performance and “slow MDS” issues while doing VM/CT backups from
the Proxmox cluster, especially when rebalancing is going on at the same
time.
I also had to increase the mds cache quite a lot to get rid of 'slow' issues:
mds_cache_memory_limit =
But after some Luminous(?) update I started using cephfs less, because of issues with the kernel mount.
My thought is that one of the following is going to improve performance / response:
1. Add an M.2 drive for DB store on each node
2. Migrate the cephfs metadata pool to SSDs
We have ~25 nodes with ~3 OSDs per node.
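For (1), for what it's worth, a BlueStore DB can be moved onto a new device without rebuilding the OSD. A rough sketch (untested; osd 0 and vg_fast/db-osd0 are placeholders, the LV on the M.2 has to be created first, and the OSD must be stopped):

    systemctl stop ceph-osd@0
    ceph-volume lvm new-db --osd-id 0 --osd-fsid <osd-fsid> --target vg_fast/db-osd0
    systemctl start ceph-osd@0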
(1) is a lot of work and will cost more.
(2) seems more risky (to me) since the metadata pool would have to be migrated (potential loss in transit?).
Can't remember running into issues; I did this switch quite a while ago. Still have only 7 GB in the ssd metadata pool vs 45 TB / 13M objects.
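From what I recall the switch itself was just repointing the pool at an ssd-only crush rule and letting it backfill in place, nothing gets copied out-of-band. Something like this (rule and pool names are examples, adjust to your setup):

    # replicated rule restricted to the ssd device class
    ceph osd crush rule create-replicated replicated-ssd default host ssd
    # repoint the metadata pool; PGs backfill onto SSD OSDs while the pool stays online
    ceph osd pool set cephfs_metadata crush_rule replicated-ssd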