Thanks everyone.
So the 3/30/300GB restriction no longer exists in Octopus; can I make it
10GB and have it use all 10GB?
Is there a migration strategy that allows me to set up the DB on the OSD,
see how much metadata my 25TB is using, make a partition on the Optane, say
quadruple that size, and then move the DB to the Optane?
Or maybe the best strategy would be to start with a small logical volume on
the Optane, copy over my 25TB of existing data and extend it if required?
The bluefs-bdev-migrate and bluefs-bdev-expand commands seem to be the
ticket.
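For reference, a rough sketch of what that could look like (OSD id and
VG/LV names are made up here, the OSD must be stopped first, and depending
on the release you may also need to fix up the block.db symlink and LVM
tags afterwards):

    # stop the OSD before touching BlueFS
    systemctl stop ceph-osd@0

    # move the DB off the main device onto a new LV on the Optane
    ceph-bluestore-tool bluefs-bdev-migrate \
        --path /var/lib/ceph/osd/ceph-0 \
        --devs-source /var/lib/ceph/osd/ceph-0/block \
        --dev-target /dev/optane/db-0

    # later, after growing the LV (lvextend -L +20G /dev/optane/db-0),
    # tell BlueFS about the extra space
    ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-0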
On 27 Nov 2020 at 6:19:06 am, Christian Wuerdig <christian.wuerdig(a)gmail.com> wrote:
Sorry, I replied to the wrong email thread before, so reposting this:
I think it's time to start pointing out that the 3/30/300 logic no longer
really holds true post-Octopus:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/CKRCB3HUR7…
Although I suppose in a way this makes it even harder to provide a sizing
recommendation.
On Fri, 27 Nov 2020 at 04:49, Burkhard Linke
<Burkhard.Linke(a)computational.bio.uni-giessen.de> wrote:
Hi,
On 11/26/20 12:45 PM, Richard Thornton wrote:
Hi,
Sorry to bother you all.
It’s a home server setup.
Three nodes (ODROID-H2+ with 32GB RAM and dual 2.5Gbit NICs), two 14TB
7200rpm SATA drives and an Optane 118GB NVMe in each node (OS boots from
eMMC).
*snipsnap*
Is there a rough CephFS calculation (each file uses x bytes of metadata)?
I think I should be safe with 30GB, but now I read I should double that
(you should allocate twice the size of the biggest layer to allow for
compaction). Since I only have 118GB and two OSDs, will I have to go for
59GB (or whatever will fit)?
The recommended size of 30 GB is due to the level design of rocksdb: data
is stored in a hierarchy of levels with increasing sizes. 30 GB is a kind
of sweet spot between 3 GB (too small for most use cases) and 300 GB (way
too large for most use cases). The recommendation to double the size for
compaction is OK, but you will waste that capacity most of the time.
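To put rough numbers on that, assuming BlueStore's default rocksdb tuning
(max_bytes_for_level_base of about 256 MB and a level multiplier of 10;
these defaults are an assumption, check your OSD config), and keeping in
mind that pre-Octopus a level only lived on the fast device if it fit
there entirely:

    L1 ~ 0.25 GB   cumulative ~ 0.25 GB
    L2 ~ 2.5  GB   cumulative ~ 2.75 GB   -> ~3 GB partition
    L3 ~ 25   GB   cumulative ~ 28 GB     -> ~30 GB partition
    L4 ~ 250  GB   cumulative ~ 278 GB    -> ~300 GB partition

Anything between those steps was wasted, which is where the 3/30/300
figures come from.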
In our cephfs instance we have ~115,000,000 files. Metadata is stored on
18 SSD-based OSDs. About 30-35 GB of raw capacity is currently in use,
almost exclusively for metadata, omap and other internal data. You might
be able to scale this down to your use case. Our average file size is
approx. 5 MB, so you can also put a little bit on top in your case.
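As a back-of-the-envelope figure from those numbers (actual usage will
vary with directory layout, snapshots and replication):

    35 GB / 115,000,000 files ~ 300 bytes of raw metadata per file
    e.g. 10,000,000 files * 300 bytes ~ 3 GB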
If your working set (files accessed in a given time span) is rather small,
you also have the option to use the SSD for a block device caching layer
like bcache or dm-cache. In this setup the whole capacity will be used,
and data operations on the OSDs will also benefit from the faster SSDs.
Your failure domain will be the same; if the SSD dies, your data disks
will be useless.
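If you go that route, a minimal bcache sketch might look like this
(bcache-tools; device names are placeholders):

    # format the HDD as backing device and the Optane partition as
    # cache, attaching them in one step
    make-bcache -B /dev/sda -C /dev/nvme0n1p1

    # then create the OSD on the resulting cached device
    ceph-volume lvm create --data /dev/bcache0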
Otherwise I would recommend using DB partitions of the recommended size
(do not forget to include some extra space for the WAL), and using the
remaining capacity for extra SSD-based OSDs similar to our setup. This
will ensure that metadata access will be fast[tm].
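For illustration, one possible carve-up of the 118GB Optane along those
lines (VG/LV names and sizes are just examples):

    pvcreate /dev/nvme0n1
    vgcreate optane /dev/nvme0n1
    lvcreate -L 34G -n db-0 optane            # DB + WAL headroom for OSD 0
    lvcreate -L 34G -n db-1 optane            # DB + WAL headroom for OSD 1
    lvcreate -l 100%FREE -n meta-osd optane   # leftover as a small SSD OSD
    ceph-volume lvm create --data /dev/sda --block.db optane/db-0
    ceph-volume lvm create --data /dev/sdb --block.db optane/db-1
    ceph-volume lvm create --data optane/meta-osd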
Regards,
Burkhard
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io