I was also confused by this topic and had intended to post a question
this week. The documentation I recall reading said something like 'if
you want to use erasure coding on a CephFS, you should use a small
replicated data pool as the first pool, and your erasure-coded pool as
the second.' I did not see any obvious indication of how this would
'auto-magically' put the small files in the replicated pool and the
large files in the erasure-coded pool, although that sounds like
desirable behavior. Instead I found the notes on 'file layouts', which
don't seem to allow size as a criterion.
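For what it's worth, as far as I can tell layouts only let you pin a
directory (or an individual file) to a specific data pool; nothing is
placed by size. Roughly what I have in mind is the following (a rough
sketch only; the fs, pool, and path names here are made up):

  # allow the EC pool to be used for CephFS data, then attach it
  ceph osd pool set my_ec_pool allow_ec_overwrites true
  ceph fs add_data_pool myfs my_ec_pool

  # new files created under this directory go to the EC pool
  setfattr -n ceph.dir.layout.pool -v my_ec_pool /mnt/myfs/big-files

  # check where a particular file's data actually ended up
  getfattr -n ceph.file.layout.pool /mnt/myfs/big-files/somefile

Note that the layout only applies to files created after the attribute
is set; existing files keep whatever layout they were created with.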
Does anybody have anything further to add that would help clarify this?
Thanks.
-Dave
Dave Hall
Binghamton University
On 2/10/20 1:26 PM, Gregory Farnum wrote:
> On Mon, Feb 10, 2020 at 12:29 AM Håkan T Johansson <f96hajo(a)chalmers.se> wrote:
>>
>> On Mon, 10 Feb 2020, Gregory Farnum wrote:
>>
>>> On Sun, Feb 9, 2020 at 3:24 PM Håkan T Johansson <f96hajo(a)chalmers.se> wrote:
>>>
>>> Hi,
>>>
>>> running 14.2.6, debian buster (backports).
>>>
>>> Have set up a cephfs with 3 data pools and one metadata pool:
>>> myfs_data, myfs_data_hdd, myfs_data_ssd, and myfs_metadata.
>>>
>>> The data of all files is, via ceph.dir.layout.pool, stored in either
>>> the pool myfs_data_hdd or myfs_data_ssd. This has also been checked
>>> by dumping the ceph.file.layout.pool attributes of all files.
>>>
>>> The filesystem has 1617949 files and 36042 directories.
>>>
>>> There are, however, approximately as many objects in the first pool
>>> created for the cephfs, myfs_data, as there are files. They also
>>> become more or fewer as files are created or deleted (so they cannot
>>> be leftovers from earlier exercises). Note how the USED size is
>>> reported as 0 bytes, correctly reflecting that no file data is stored
>>> in them.
>>>
>>> POOL_NAME         USED  OBJECTS  CLONES   COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED   RD_OPS       RD    WR_OPS       WR  USED COMPR  UNDER COMPR
>>> myfs_data          0 B  1618229       0  4854687                   0        0         0  2263590  129 GiB  23312479  124 GiB         0 B          0 B
>>> myfs_data_hdd  831 GiB   136309       0   408927                   0        0         0   106046  200 GiB    269084  277 GiB         0 B          0 B
>>> myfs_data_ssd   43 GiB  1552412       0  4657236                   0        0         0   181468  2.3 GiB   4661935   12 GiB         0 B          0 B
>>> myfs_metadata  1.2 GiB    36096       0   108288                   0        0         0  4828623   82 GiB   1355102  143 GiB         0 B          0 B
>>>
>>> Is this expected?
>>>
>>> I was assuming that in this scenario all objects, both their data and
>>> any keys, would be either in the metadata pool or in the two pools
>>> where the objects are stored.
>>>
>>> Are some additional metadata keys stored in the first created data
>>> pool for cephfs? That would not be so nice if the OSD selection rules
>>> for that pool use worse disks than the data itself...
>>>
>>>
>>>
>>> https://docs.ceph.com/docs/master/cephfs/file-layouts/#adding-a-data-pool-t…
>>> notes there is “a small amount of metadata” kept in the primary pool.
>> Thanks! This I managed to miss, probably as it was at the bottom of the
>> page. In case one wants to use layouts to separate fast (likely many)
>> from slow (likely large) files, it then sounds as if the primary pool
>> should be of the fast kind too, due to the large number of objects.
>> Thus this needs to be highlighted early in that documentation.
>>
>>> That’s not terribly clear; what is actually stored is a per-file
>>> location backtrace (its location in the directory tree) used for
>>> hardlink lookups and disaster recovery scenarios.
>> This info would be nice to add to the manual page. It is nice to know
>> what kind of information is stored there.
> Yeah, PRs welcome. :p
> Just to be clear though, that shouldn't be performance-critical. It's
> lazily updated by the MDS when the directory location changes, but not
> otherwise.
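In case it is useful to anyone later (and please correct me if I have
this wrong): my understanding is that this backtrace lives in the
'parent' xattr of the file's first RADOS object in the primary data
pool, so it can be peeked at roughly like this (untested sketch; the
object name is made up, real ones look like <inode-in-hex>.00000000,
and this assumes the inode_backtrace_t type is known to ceph-dencoder
on your build):

  # find the file's inode number in hex
  printf '%x\n' $(stat -c %i /mnt/myfs/some/file)

  # fetch and decode the backtrace from the primary data pool
  rados -p myfs_data getxattr 10000000001.00000000 parent > /tmp/parent.bin
  ceph-dencoder type inode_backtrace_t import /tmp/parent.bin decode dump_json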
>
>> Again thanks for the clarification!
>>
>>> Btw: is there any tool to see the amount of key-value data associated
>>> with a pool? 'ceph osd df' gives omap and meta for OSDs, but not
>>> broken down per pool.
>>>
>>>
>>> I think this is in the newest master code, but I’m not certain which
>>> release it’s in...
>> Would it then (when available) also be in the 'rados df' command?
> I really don't remember how everything is shared out but I think so?
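For anyone searching the archives later: on 14.2.x the closest I have
found is the per-OSD view; I would guess per-pool omap numbers will
eventually show up in 'ceph df detail' once the change Greg mentions is
in a release, but that is speculation on my part:

  ceph osd df       # OMAP / META columns, but per OSD only
  ceph df detail    # per-pool stats (no omap breakdown that I can see on 14.2.x)
  rados df          # per-pool object and usage stats, also without omap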
>
>> Best regards,
>> Håkan
>>
>>
>>> -Greg
>>>
>>>
>>>
>>> Best regards,
>>> Håkan