Hey Wladimir,
I actually don't know where this is referenced in the docs, if anywhere.
Googling around shows many people discovering this overhead the hard way on
ceph-users.
I also don't know the rbd journaling mechanism in enough depth to comment
on whether it could be causing this issue for you. Are you seeing a high
allocated:stored ratio on your cluster?
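One rough way to check (just a sketch, using the same columns that show up in the
ceph df output quoted further down): divide USED by STORED for the data pool.

    ceph df
    # for the suspect pool: allocated:stored is roughly USED / STORED
    # (in the output quoted below, 3.5 TiB / 501 GiB ~= 7:1)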
Josh
On Sun, Jul 4, 2021 at 6:52 AM Wladimir Mutel <mwg(a)mwg.dp.ua> wrote:
Dear Mr Baergen,
thanks a lot for your very concise explanation.
However, I would like to learn more about why the default Bluestore allocation
size causes such a big storage overhead, and where in the Ceph docs it is
explained what to watch for so as to avoid hitting this phenomenon again and again.
I have a feeling this is what I am getting on my experimental Ceph setup with the
simplest JErasure 2+1 data pool.
Could it be caused by journaled RBD writes to the EC data pool?
Josh Baergen wrote:
Hey Arkadiy,
If the OSDs are on HDDs and were created with the default
bluestore_min_alloc_size_hdd, which is still 64KiB in Octopus, then in effect
data will be allocated from the pool in 640KiB chunks (64KiB * (k+m)). 5.36M
objects taking up 501GiB is an average object size of 98KiB, which results in a
ratio of 6.53:1 allocated:stored, pretty close to the 7:1 observed.
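Spelling that arithmetic out (all numbers taken from the ceph df output below):

    min allocation per object = 64 KiB * (k + m) = 64 KiB * 10  = 640 KiB
    average object size       = 501 GiB / 5.36M objects        ~=  98 KiB
    allocated : stored        ~= 640 KiB / 98 KiB              ~= 6.53 : 1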
If my assumption about your configuration is correct, then the only way to fix
this is to adjust bluestore_min_alloc_size_hdd and recreate all your OSDs,
which will take a while...
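As a rough sketch of what that involves (the exact commands depend on how your
OSDs were deployed, e.g. plain ceph-volume vs. an orchestrator; the OSD id and
device path below are placeholders):

    # lower the allocation unit for HDD OSDs created from now on; existing
    # OSDs keep the value they were built with
    ceph config set osd bluestore_min_alloc_size_hdd 4096

    # then, one OSD (or one failure domain) at a time:
    ceph osd out 0                              # drain osd.0
    # ...wait until all PGs are active+clean again...
    ceph osd purge 0 --yes-i-really-mean-it     # remove it from the cluster
    ceph-volume lvm zap --destroy /dev/sdX      # wipe the old device
    ceph-volume lvm create --data /dev/sdX      # recreate the OSD with the new setting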
Josh
On Tue, Jun 29, 2021 at 3:07 PM Arkadiy Kulev <eth(a)ethaniel.com> wrote:
> The pool default.rgw.buckets.data has 501 GiB stored, but USED shows
> 3.5 TiB (7 times higher!):
>
> root@ceph-01:~# ceph df
> --- RAW STORAGE ---
> CLASS    SIZE     AVAIL    USED     RAW USED  %RAW USED
> hdd      196 TiB  193 TiB  3.5 TiB  3.6 TiB        1.85
> TOTAL    196 TiB  193 TiB  3.5 TiB  3.6 TiB        1.85
>
> --- POOLS ---
> POOL                       ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
> device_health_metrics       1    1   19 KiB       12   56 KiB      0     61 TiB
> .rgw.root                   2   32  2.6 KiB        6  1.1 MiB      0     61 TiB
> default.rgw.log             3   32  168 KiB      210   13 MiB      0     61 TiB
> default.rgw.control         4   32      0 B        8      0 B      0     61 TiB
> default.rgw.meta            5    8  4.8 KiB       11  1.9 MiB      0     61 TiB
> default.rgw.buckets.index   6    8  1.6 GiB      211  4.7 GiB      0     61 TiB
> default.rgw.buckets.data   10  128  501 GiB    5.36M  3.5 TiB   1.90    110 TiB
>
> The default.rgw.buckets.data pool is using erasure coding:
>
> root@ceph-01:~# ceph osd erasure-code-profile get EC_RGW_HOST
> crush-device-class=hdd
> crush-failure-domain=host
> crush-root=default
> jerasure-per-chunk-alignment=false
> k=6
> m=4
> plugin=jerasure
> technique=reed_sol_van
> w=8
>
> If anyone could help explain why it's using up 7 times more space, it would
> help a lot. Versioning is disabled. ceph version 15.2.13 (octopus stable).
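>
> (For what it's worth, with k=6, m=4 the erasure-coding overhead alone should
> only be (k+m)/k = 10/6 ~= 1.67x, i.e. roughly 835 GiB used for 501 GiB stored,
> so 3.5 TiB is about 4x more than the EC profile by itself would explain.)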
Sincerely,
Ark.
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io