In our experience, delete speed scales with the CPU available to the MDS,
and a single MDS daemon only seems to scale to 2-4 CPUs, so for our biggest
filesystem we run 5 active MDS daemons. Migrations between MDS ranks reduced
performance a lot, but pinning fixed that. Even better is simply getting the
fastest cores you can.
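(For reference, and not something from the original posts: directory pinning
is done by setting the ceph.dir.pin extended attribute on a directory, e.g.
with setfattr or from Python; the mount point and rank below are placeholders.)

    import os

    # Pin /mnt/cephfs/projects (and the subtree below it) to MDS rank 1,
    # equivalent to: setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/projects
    os.setxattr("/mnt/cephfs/projects", "ceph.dir.pin", b"1")

    # A value of b"-1" removes the pin so the balancer may migrate it again.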
On Thu., Nov. 12, 2020, 6:08 p.m. Brent Kennedy, <bkennedy(a)cfl.rr.com>
wrote:
Ceph is definitely a good choice for storing millions of files. It sounds
like you plan to use this like S3, so my first question would be: are the
deletes done for a specific reason? (e.g. the files are used for a process
and then discarded) If it's an age thing, you can set the files to expire
when putting them in, and Ceph will automatically clear them.
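(As an illustration of per-object expiry, not something from Brent's mail:
with the Swift API, RGW can expire objects via the X-Delete-After header set
at upload time; the endpoint, credentials and container below are placeholders,
and this assumes your RGW release supports Swift object expiration.)

    from swiftclient.client import Connection

    # Placeholder endpoint and credentials; adjust for your RGW Swift setup.
    conn = Connection(
        authurl="https://rgw.example.com/auth/v1.0",
        user="tenant:user",
        key="secret",
    )

    # Upload an object that RGW should delete automatically 30 days after PUT.
    conn.put_object(
        "mycontainer",
        "thumbnails/img-0001.jpg",
        contents=b"...object data...",
        headers={"X-Delete-After": str(30 * 24 * 3600)},
    )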
The more spinners you have, the more performance you will end up with. Is
the network 10Gb or higher?
Octopus is production stable and contains many performance enhancements.
Depending on the OS, you may not be able to upgrade from Nautilus until
they work out that process (e.g. CentOS 7/8).
Delete speed is not that great, but you would have to test it with your
cluster to see how it performs for your use case. If you have enough space
available, is there a process that breaks if the files are not deleted?
Regards,
-Brent
Existing Clusters:
Test: Octopus 15.2.5 (all virtual on NVMe)
US Production(HDD): Nautilus 14.2.11 with 11 osd servers, 3 mons, 4
gateways, 2 iscsi gateways
UK Production(HDD): Nautilus 14.2.11 with 18 osd servers, 3 mons, 4
gateways, 2 iscsi gateways
US Production(SSD): Nautilus 14.2.11 with 6 osd servers, 3 mons, 4
gateways, 2 iscsi gateways
UK Production(SSD): Octopus 15.2.5 with 5 osd servers, 3 mons, 4 gateways
-----Original Message-----
From: Adrian Nicolae <adrian.nicolae(a)rcs-rds.ro>
Sent: Wednesday, November 11, 2020 3:42 PM
To: ceph-users <ceph-users(a)ceph.io>
Subject: [ceph-users] question about rgw delete speed
Hey guys,
I'm in charge of a local cloud-storage service. Our primary object storage
is a vendor-based one, and I want to replace it in the near future with Ceph
with the following setup:
- 6 OSD servers with 36 SATA 16TB drives each and 3 big NVMe per server
(1 big NVMe for every 12 drives, so I can reserve 300GB of NVMe storage for
every SATA drive), 3 MON, 2 RGW with EPYC 7402P and 128GB RAM. So in the
end we'll have ~3PB of raw data and 216 SATA drives.
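(A quick back-of-envelope check of those numbers, mine rather than part of
Adrian's mail:)

    # Rough capacity math for the proposed setup.
    osd_servers = 6
    hdds_per_server = 36
    hdd_tb = 16
    nvmes_per_server = 3

    total_hdds = osd_servers * hdds_per_server           # 216 SATA drives
    raw_pb = total_hdds * hdd_tb / 1000.0                 # 3.456 PB raw
    hdds_per_nvme = hdds_per_server // nvmes_per_server   # 12 HDDs share one NVMe
    nvme_tb_per_device = hdds_per_nvme * 0.3              # 3.6 TB NVMe per device
                                                          # to give 300 GB per HDD

    print(total_hdds, raw_pb, hdds_per_nvme, nvme_tb_per_device)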
Currently we have ~100 million files on the primary storage, with the
following distribution:
- ~10% = very small files (less than 1MB: thumbnails, text & office files
and so on)
- ~60% = small files (between 1MB and 10MB)
- ~20% = medium files (between 10MB and 1GB)
- ~10% = big files (over 1GB).
My main concern is the speed of delete operations. We have around
500k-600k delete ops every 24 hours (an average of roughly 6-7 deletes per
second), so quite a lot. Our current storage is not deleting all the files
fast enough (it's always 1 week to 10 days behind). I guess it's not only a
software issue, and the delete speed will probably get better if we add more
drives (we now have 108).
What do you think about Ceph delete speed? I read on other threads that
it's not very fast. I wonder if this hardware setup can handle our current
delete load better than our current storage. On the RGW servers I want to
use Swift, not S3.
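(One way to get a feel for delete throughput on a test cluster is to time
concurrent deletes through the Swift API. This is only a sketch with
placeholder endpoint, credentials, container and object names, not something
from the thread.)

    import threading
    import time
    from concurrent.futures import ThreadPoolExecutor
    from swiftclient.client import Connection

    # Placeholder endpoint/credentials: point these at a test RGW instance.
    AUTH = dict(authurl="https://rgw.example.com/auth/v1.0",
                user="tenant:user", key="secret")
    CONTAINER = "delete-test"

    tls = threading.local()

    def delete_one(name):
        # One Swift connection per worker thread.
        if not hasattr(tls, "conn"):
            tls.conn = Connection(**AUTH)
        tls.conn.delete_object(CONTAINER, name)

    # Assumes these objects were already uploaded to the test container.
    objects = ["obj-%06d" % i for i in range(10000)]

    start = time.time()
    with ThreadPoolExecutor(max_workers=32) as pool:
        list(pool.map(delete_one, objects))
    elapsed = time.time() - start
    print("%.1f deletes/sec with 32 workers" % (len(objects) / elapsed))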
And another question: can I deploy the latest Ceph version (Octopus)
directly in production, or is it safer to start with Nautilus until Octopus
is more stable?
Any input would be greatly appreciated!
Thanks,
Adrian.
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io