On 11/09/2019 04:14, Yan, Zheng wrote:
On Wed, Sep 11, 2019 at 6:51 AM Kenneth Waegeman
<kenneth.waegeman(a)ugent.be> wrote:
We sync the file system without preserving hard links, but we take
snapshots after each sync, so I guess files that are deleted but still
referenced by snapshots can also end up in the stray directories?
[root@mds02 ~]# ceph daemon mds.mds02 perf dump | grep -i 'stray\|purge'
    "finisher-PurgeQueue": {
        "num_strays": 990153,
        "num_strays_delayed": 32,
        "num_strays_enqueuing": 0,
        "strays_created": 753278,
        "strays_enqueued": 650603,
        "strays_reintegrated": 0,
        "strays_migrated": 0,
num_strays is indeed close to a million
The issue is related to snapshots. Inodes of deleted files that are
still referenced by snapshots stay in the stray directory. I suggest
deleting some old snapshots.
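For example (just a sketch; the mount point and snapshot name are
placeholders, assuming the snapshots were created at the root of the
mount):

# list existing snapshots (path is only an example)
ls /mnt/cephfs/.snap
# remove one of them; <snapname> is a placeholder
rmdir /mnt/cephfs/.snap/<snapname>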
We only have a few snapshots, and they are not very old :) But deleting
a few, waiting for the trim and restarting the MDSs reduced num_strays,
so this fixes it temporarily.
I've also made a ticket
https://tracker.ceph.com/issues/41778
Thanks!
Kenneth
>
>> On 10/09/2019 12:42, Burkhard Linke wrote:
>>> Hi,
>>>
>>>
>>> do you use hard links in your workload? The 'no space left on device'
>>> message may also refer to too many stray files. Strays are either
>>> files that are queued for deletion (e.g. the purge queue) or files
>>> that have been deleted but still have hard links pointing to the same
>>> content. Since cephfs does not use an indirection layer between
>>> inodes and data, and the data objects are named after the inode id,
>>> removing the original file leaves stray entries because cephfs is not
>>> able to rename the underlying rados objects.
>>>
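>>> As an illustration of that naming scheme (just a sketch; the pool name
>>> 'cephfs_data' and the mount path are assumptions, not taken from this
>>> thread), the data objects of a file are prefixed with its inode number
>>> in hex:
>>>
>>> # hex inode number of a file on a cephfs mount (path is an example)
>>> printf '%x\n' $(stat -c %i /mnt/cephfs/path/to/file)
>>> # list the matching objects in the data pool; the first chunk is <hexino>.00000000
>>> rados -p cephfs_data ls | grep '^<hexino>\.'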
>>>
>>> There are 10 hidden directories for stray files, and given a maximum
>>> size of 100,000 entries per directory you can store only up to 1
>>> million stray entries. I don't know exactly how entries are
>>> distributed among the 10 directories, so the limit may be reached
>>> earlier for a single stray directory. The performance counters contain
>>> some values for strays, so they are easy to check. The daemonperf
>>> output also shows the current value.
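>>>
>>> For example (the mds name is a placeholder):
>>>
>>> # current stray counters from the admin socket
>>> ceph daemon mds.<name> perf dump | grep -i stray
>>> # live view of the counters
>>> ceph daemonperf mds.<name>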
>>>
>>>
>>> The problem of the upper limit of directory entries was solved by
>>> directory fragmentation, so you should check whether fragmentation is
>>> allowed on your filesystem. You can also try to increase the upper
>>> directory entry limit, but this might lead to other problems (too
>>> large rados omap objects).
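>>>
>>> For example (a sketch; the filesystem name and the new limit are
>>> placeholders, and on recent releases fragmentation is enabled by
>>> default):
>>>
>>> # allow directory fragmentation (only needed on older releases)
>>> ceph fs set <fsname> allow_dirfrags true
>>> # raise the per-fragment entry limit (default 100000)
>>> ceph config set mds mds_bal_fragment_size_max 200000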
>>>
>>>
>>> Regards,
>>>
>>> Burkhard
>>>
>>>