On Mon, Feb 10, 2020 at 12:29 AM Håkan T Johansson
<f96hajo(a)chalmers.se> wrote:
On Mon, 10 Feb 2020, Gregory Farnum wrote:
On Sun, Feb 9, 2020 at 3:24 PM Håkan T Johansson
<f96hajo(a)chalmers.se> wrote:
Hi,
running 14.2.6, debian buster (backports).
Have set up a cephfs with 3 data pools and one metadata pool:
myfs_data, myfs_data_hdd, myfs_data_ssd, and myfs_metadata.
The data of all files is, via ceph.dir.layout.pool, stored in either
myfs_data_hdd or myfs_data_ssd. This has also been verified by dumping
the ceph.file.layout.pool attribute of all files.
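For reference, the layouts were set and checked roughly like the
following, with the usual xattr tools on the mount point (the paths
here are just examples):

    # route new files under a directory to the ssd pool
    setfattr -n ceph.dir.layout.pool -v myfs_data_ssd /mnt/myfs/fast
    # check which pool an individual file's data ended up in
    getfattr -n ceph.file.layout.pool /mnt/myfs/fast/somefile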
The filesystem has 1617949 files and 36042 directories.
There are however approximately as many objects in the first pool
created for the cephfs, myfs_data, as there are files (1618229 objects
vs 1617949 files). Their number also grows and shrinks as files are
created or deleted, so they cannot be leftovers from earlier exercises.
Note how the USED size is reported as 0 bytes, correctly reflecting
that no file data is stored in those objects.
POOL_NAME         USED  OBJECTS  CLONES   COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED   RD_OPS       RD    WR_OPS       WR  USED COMPR  UNDER COMPR
myfs_data          0 B  1618229       0  4854687                   0        0         0  2263590  129 GiB  23312479  124 GiB         0 B          0 B
myfs_data_hdd  831 GiB   136309       0   408927                   0        0         0   106046  200 GiB    269084  277 GiB         0 B          0 B
myfs_data_ssd   43 GiB  1552412       0  4657236                   0        0         0   181468  2.3 GiB   4661935   12 GiB         0 B          0 B
myfs_metadata  1.2 GiB    36096       0   108288                   0        0         0  4828623   82 GiB   1355102  143 GiB         0 B          0 B
Is this expected? I was assuming that in this scenario all objects,
both their data and any keys, would be in either the metadata pool or
the two pools where the file data is stored.
Are these some additional metadata keys stored in the first-created
data pool of the cephfs? That would not be so nice in case the OSD
selection rules for that pool use worse disks than those for the data
itself...
https://docs.ceph.com/docs/master/cephfs/file-layouts/#adding-a-data-pool-t…
notes there is “a small amount of metadata” kept in the primary pool.
Thanks! This I managed to miss, probably because it is at the bottom of
the page. If one wants to use layouts to separate fast (likely many)
files from slow (likely large) ones, it then sounds as if the primary
pool should be of the fast kind too, given the large number of objects.
So this needs to be highlighted early in that documentation.
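Concretely, if I understand right, that would mean creating the
filesystem with the fast pool as its first data pool and attaching the
slow pool afterwards; a sketch with the pool names from above:

    # make the fast pool the first (primary) data pool at creation time
    ceph fs new myfs myfs_metadata myfs_data_ssd
    # attach the slower pool, then direct bulk data to it via layouts
    ceph fs add_data_pool myfs myfs_data_hdd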
That’s not terribly clear; what is actually stored is a per-file
location backtrace (its location in the directory tree) used for
hardlink lookups and disaster recovery scenarios.
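If you want to look at one, the backtrace sits in the "parent" xattr of
a file's first object in that pool; with a made-up inode number it can
be decoded along these lines:

    # fetch and decode the backtrace of inode 10000000000 (hypothetical)
    rados -p myfs_data getxattr 10000000000.00000000 parent > parent.bin
    ceph-dencoder type inode_backtrace_t import parent.bin decode dump_json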
This info would be nice to add to the manual page. It is good to know
what kind of information is stored there.
Yeah, PRs welcome. :p
Just to be clear though, that shouldn't be performance-critical. It's
lazily updated by the MDS when the directory location changes, but not
otherwise.
Again thanks for the clarification!
Btw: is there any tool to see the amount of key/value data associated
with a pool? 'ceph osd df' gives omap and meta for OSDs, but not broken
down per pool.
I think this is in the newest master code, but I’m not certain which
release it’s in...
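One partial workaround on 14.2, if I remember right, is to sum the
per-PG omap stats for the pool, e.g.:

    # per-PG OMAP_BYTES*/OMAP_KEYS* for one pool; the values are only
    # refreshed by (deep) scrub, hence the asterisks
    ceph pg ls-by-pool myfs_metadata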
Would it then (when available) also be in the 'rados df' command?
Best regards,
Håkan
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io