Hey Wladimir,
That output looks like it's from Nautilus or later. My understanding is
that the USED column is in raw bytes, whereas STORED is "user" bytes. If
you're using EC 2:1 for all of those pools, I would expect USED to be at
least 1.5x STORED, which looks to be the case for jerasure21. Perhaps your
libvirt pool is 3x replicated, in which case the numbers add up as well.
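To make that concrete, here is a quick arithmetic sketch (assuming, as guessed above, that jerasure21 really is EC 2+1 and libvirt is 3x replicated; the TiB figures are taken from the ceph df output quoted below):

```python
# Sanity-check USED vs STORED from the quoted ceph df output,
# assuming jerasure21 is EC k=2,m=1 and libvirt is 3x replicated.

def ec_overhead(k: int, m: int) -> float:
    """Raw bytes written per user byte under k+m erasure coding."""
    return (k + m) / k

# jerasure21: EC 2+1 -> 1.5x overhead; 9.0 TiB stored -> ~13.5 TiB raw,
# in the same ballpark as the 13 TiB USED shown by ceph df.
print(9.0 * ec_overhead(2, 1))

# libvirt: 3x replication; 1.5 TiB stored -> 4.5 TiB raw,
# matching the USED column exactly.
print(1.5 * 3)
```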
Josh
On Tue, Jul 6, 2021 at 5:51 AM Wladimir Mutel <mwg(a)mwg.dp.ua> wrote:
I started my experimental 1-host/8-HDDs setup in 2018 with Luminous,
and I read
https://ceph.io/community/new-luminous-erasure-coding-rbd-cephfs/ ,
which had interested me in using Bluestore and rewritable EC pools
for RBD data.
I have about 22 TiB of raw storage, and ceph df shows this:
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 22 TiB 2.7 TiB 19 TiB 19 TiB 87.78
TOTAL 22 TiB 2.7 TiB 19 TiB 19 TiB 87.78
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
jerasure21 1 256 9.0 TiB 2.32M 13 TiB 97.06 276 GiB
libvirt 2 128 1.5 TiB 413.60k 4.5 TiB 91.77 140 GiB
rbd 3 32 798 KiB 5 2.7 MiB 0 138 GiB
iso 4 32 2.3 MiB 10 8.0 MiB 0 138 GiB
device_health_metrics 5 1 31 MiB 9 94 MiB 0.02 138 GiB
If I add USED for libvirt and jerasure21, I get 17.5 TiB, and 2.7 TiB
is shown as RAW STORAGE/AVAIL.
The sum of POOLS/MAX AVAIL is about 840 GiB; where is my other
2.7 - 0.840 =~ 1.86 TiB ???
Or, in other words, where is my (RAW STORAGE/RAW USED) -
(SUM(POOLS/USED)) = 19 - 17.5 = 1.5 TiB ?
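The same accounting, as a minimal Python sketch (all figures in TiB, taken from the ceph df output above):

```python
# Space accounting from the ceph df output above (all figures in TiB).

raw_used = 19.0
pool_used = {"jerasure21": 13.0, "libvirt": 4.5}  # rbd/iso/health pools are negligible

gap = raw_used - sum(pool_used.values())
print(f"raw used not attributed to any pool: {gap:.1f} TiB")  # 1.5 TiB
```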
As it does not seem I will get any more hosts for this setup, I am
seriously thinking of taking this Ceph cluster down and instead
setting up Btrfs with qcow2 images served over iSCSI, which looks
simpler to me for a single-host situation.
Josh Baergen wrote:
Hey Wladimir,
I actually don't know where this is referenced in the docs, if anywhere.
Googling around shows many people discovering this overhead the hard way on
ceph-users.
I also don't know the rbd journaling mechanism in enough depth to
comment on whether it could be causing this issue for you. Are you
seeing a high allocated:stored ratio on your cluster?
Josh
On Sun, Jul 4, 2021 at 6:52 AM Wladimir Mutel <mwg(a)mwg.dp.ua> wrote:
Dear Mr Baergen,
thanks a lot for your very concise explanation. However, I would like
to learn more about why the default Bluestore allocation size causes
such a big storage overhead, and where in the Ceph docs it is
explained how and what to watch for to avoid hitting this phenomenon
again and again.
I have a feeling this is what I get on my experimental Ceph setup
with the simplest JErasure 2+1 data pool. Could it be caused by
journaled RBD writes to the EC data pool?
Josh Baergen wrote:
> Hey Arkadiy,
>
> If the OSDs are on HDDs and were created with the default
> bluestore_min_alloc_size_hdd, which is still 64KiB in Octopus, then
> in effect data will be allocated from the pool in 640KiB chunks
> (64KiB * (k+m)). 5.36M objects taking up 501GiB is an average object
> size of 98KiB, which results in a ratio of 6.53:1 allocated:stored,
> which is pretty close to the 7:1 observed.
>
> If my assumption about your configuration is correct, then the only
> way to fix this is to adjust bluestore_min_alloc_size_hdd and
> recreate all your OSDs, which will take a while...
>
> Josh
>
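The arithmetic in that quoted explanation can be reproduced directly. A sketch, assuming the default bluestore_min_alloc_size_hdd of 64 KiB and the EC 6+4 profile shown further down:

```python
import math

# Reproduce the allocation-overhead arithmetic from the quoted
# explanation, assuming bluestore_min_alloc_size_hdd = 64 KiB
# and the EC 6+4 profile from the original report.

min_alloc_kib = 64
k, m = 6, 4

stored_gib = 501      # STORED for default.rgw.buckets.data
objects = 5.36e6      # OBJECTS for the same pool

avg_object_kib = stored_gib * 1024 * 1024 / objects  # ~98 KiB

# Each object is split into k data chunks (plus m parity chunks),
# and every chunk is rounded up to min_alloc_size on disk.
chunk_kib = avg_object_kib / k                       # ~16 KiB, far below 64 KiB
allocated_kib = math.ceil(chunk_kib / min_alloc_kib) * min_alloc_kib * (k + m)

print(f"avg object: {avg_object_kib:.0f} KiB")                            # ~98 KiB
print(f"allocated:stored ratio: {allocated_kib / avg_object_kib:.2f}:1")  # ~6.53:1
```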
> On Tue, Jun 29, 2021 at 3:07 PM Arkadiy Kulev <eth(a)ethaniel.com> wrote:
>
>> The pool *default.rgw.buckets.data* has *501 GiB* stored, but USED
>> shows *3.5 TiB* (7 times higher!):
>>
>> root@ceph-01:~# ceph df
>> --- RAW STORAGE ---
>> CLASS SIZE AVAIL USED RAW USED %RAW USED
>> hdd 196 TiB 193 TiB 3.5 TiB 3.6 TiB 1.85
>> TOTAL 196 TiB 193 TiB 3.5 TiB 3.6 TiB 1.85
>>
>> --- POOLS ---
>> POOL                       ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
>> device_health_metrics      1   1    19 KiB   12       56 KiB   0      61 TiB
>> .rgw.root                  2   32   2.6 KiB  6        1.1 MiB  0      61 TiB
>> default.rgw.log            3   32   168 KiB  210      13 MiB   0      61 TiB
>> default.rgw.control        4   32   0 B      8        0 B      0      61 TiB
>> default.rgw.meta           5   8    4.8 KiB  11       1.9 MiB  0      61 TiB
>> default.rgw.buckets.index  6   8    1.6 GiB  211      4.7 GiB  0      61 TiB
>> default.rgw.buckets.data   10  128  501 GiB  5.36M    3.5 TiB  1.90   110 TiB
>>
>> The *default.rgw.buckets.data* pool is using erasure coding:
>>
>> root@ceph-01:~# ceph osd erasure-code-profile get EC_RGW_HOST
>> crush-device-class=hdd
>> crush-failure-domain=host
>> crush-root=default
>> jerasure-per-chunk-alignment=false
>> k=6
>> m=4
>> plugin=jerasure
>> technique=reed_sol_van
>> w=8
>>
>> If anyone could help explain why it's using up 7 times more space,
>> it would help a lot. Versioning is disabled. ceph version 15.2.13
>> (octopus stable).
>>
>> Sincerely,
>> Ark.
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io
>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>