Hi,
my Name is Moritz and I am working for a 3D production company. Because of the corona virus I have too much time left and also to much unused hardware. That is why I started playing around with Ceph as a fileserver for us. Here I want to share my experience for all those who are interested. To start of here is my actual running test system. I am interested in the thoughts of the community and also on more suggestions on what to try out with my available Hardware. I don’t know how to test it right now because I am a newbie to ceph and our production file server is a super user-friendly but high performance Synology NAS 😉. All I have done so far was running Crystal disk benchmark on 1 Windows machine on the SMB Share.
3 Nodes: (original those where render workstations that are not in use right now)
Each Node is MON MGM OSD
Mainboard: ASRock TRX40 Creator
CPU: AMD Ryzen Threadripper 3960X, 24 Cores, 3.8Ghz
RAM: 2 x Samsung 32 GB 2 x 8 DDR4 2666 MHz 288-pin DIMM, Unregistered, ECC (64 GB Total)
NIC Public: OnBoard Aquantia 107, 10Gbit
NIC Ceph: Intel XXV710-DA2, 2x SFP28, 25Gbit
System Drive: 2x Samsung SSD 860 PRO 256GB, SATA, ZFS Raid 1
System: Proxmox VE 6.2, Debian Buster, Ceph Nautilus
HBA: Broadcom SAS 9305-16i
OSDs:
6x Seagate Exos, 16TB, 7.200 rpm, 12Gb SAS
Cache:
1x Micron 9300 MAX 3.2TB U.2 NVME
I Played around with setting it up as a WAL/DB Device. Right now I have configured the Micron NVME as a BCache Infront of the 6 Seagate Drives in writeback mode.
Because in this configuration BCache takes care of translating random writes to sequential ones for the HDDs I turned the Ceph WAL LOG off. I think Bcache gives more options to tune the System for my use case instead of just putting WAL/DB on the NVME. And also I can easily add cache drives or remove them without touching osds.
I set up SMB Shares with the vfs_ceph module. I still have to add CTDB to distribute Samba to all nodes.
My Next steps are to keep playing around in tuning the system and testing stability and performance. After that I want to put the Ceph cluster infront of our production NAS. Because our data is not super critical I thought of setting the replicas to 2 and running Rsync overnight to our NAS. That way I can switch to the old NAS at any time and wouldn’t loose more than 1 Day of work which is acceptable for us. This is how I could compare the two solutions side by side with real-life workload.
I know that ceph might not be the best solution right now but if I am able to get at least similar performance to our Synology HDD NAS out of it, it would give a super scalable Solution in size and performance to grow with our needs. And who knows what performance improvements we get with ceph in the next 3 years.
I am happy to hear your thoughts and ideas. And please I know this might be kind of a crazy setup but I have fun with it and I learned a lot the last few weeks. If my experiment fails I will go back to my original plan: Put FreeNas on two of the Nodes with overnight replication and put the third Node back to his render-friends. 😃
By the way I also have a spare Dell Server: 2x Xeon E5-2630 v3 2,40GHz, 128G Ram. I just don’t have an idea on how to utilize it. Maybe as extra OSD Node or as a separate Samba Server to get the SMB traffic away from the Public Ceph Network.
Moritz Wilhelm
Hello Samuel,
Why you want to do that?
Just remove the rbd - as long your Image is distributet over random OSDs there is no way to recovery a delete Image..
...2 cents
Mehmet
Am 12. Mai 2020 12:55:22 MESZ schrieb "huxiaoyu(a)horebdata.cn" <huxiaoyu(a)horebdata.cn>:
>Hi, Ceph folks,
>
>Is there a rbd command, or any other way, to zero out rbd images or
>volume? I would like to write all zero data to an rbd image/volume
>before remove it.
>
>Any comments would be appreciated.
>
>best regards,
>
>samuel
>Horebdata AG
>Switzerland
>
>
>
>
>huxiaoyu(a)horebdata.cn
>_______________________________________________
>ceph-users mailing list -- ceph-users(a)ceph.io
>To unsubscribe send an email to ceph-users-leave(a)ceph.io
Hi All,
I have question regarding Ceph Nautilus upgrade. In our test environment
upgrading
Luminous to Nautilus 14.2.8, and after enable msgr2, we seen one of the mon
node restarted, my question is this normal process of restart mon service
and 2nd question is, we using below mon_host format that can be correct or
move to v2?
mon_host = 10.44.172.181,10.44.172.182,10.44.172.183
Thanks,
AmitG
Hi,
I am trying to setup NFS ganesh in Ceph Nautilus.
In a ubuntu 18.04 system i have installed nfs-ganesha (v2.6) and
nfs-ganesha-ceph pkg and followed the steps in the link
https://docs.ceph.com/docs/nautilus/cephfs/nfs/ but i am not able to
export my cephfs volume there is no error msg in nfs-ganesha, also i doubt
whether its loading nfs-ganesha-ceph config file from "/etc/ganesha" folder.
From same system i am able to mount thru ceph kernel client without any
issue?
How do i make this work?
regards
Amudhan
Hi Marc,
I read
https://blog.dachary.org/2015/05/12/ceph-jerasure-and-isa-plugins-benchmark…
and
it seem there is significant performance difference between benchmarked
plugins. But, it was with old Intel E3-1200v2 series. I have read some
others EC benchmark result improvements with current CPU but I cannot find
it for Ceph implementation.
Best regards,
On Fri, May 15, 2020 at 9:03 PM Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
>
> How many % of the latency is even CPU related?
>
>
>
> -----Original Message-----
> From: Lazuardi Nasution [mailto:mrxlazuardin@gmail.com]
> Sent: 15 May 2020 16:00
> To: ceph-users(a)ceph.io
> Subject: [ceph-users] EC Plugins Benchmark with Current Intel/AMD CPU
>
> Hi,
>
> Is there any EC plugins benchmark with current Intel/AMD CPU? It seem
> there are new instructions which may accelerate EC. Let's say we want to
> benchmark plugins using Intel 6200 or AMD 7002 series. I hope there is
> better result than what have been benchmarked some years ago.
>
> Best regards,
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an
> email to ceph-users-leave(a)ceph.io
>
>
>
Hi,
Is there any EC plugins benchmark with current Intel/AMD CPU? It seem there
are new instructions which may accelerate EC. Let's say we want to
benchmark plugins using Intel 6200 or AMD 7002 series. I hope there is
better result than what have been benchmarked some years ago.
Best regards,
> Hi,
>
> I have two users both belong to different tenant.
>
> Can I give permission for the user in another tenant to access the bucket
> using setacl or setPolicy command ?
> I tried the setacl command and setpolicy command, but it was not working ?
> It used to say bucket not found, when the grantee tried to access.
>
> Is this supported ?
>
> *Thanks & Regards,*
> *Vishwas *
>
>
Dear all,
We're running Ceph Luminous and we've recently hit an issue with some OSD's (autoout states, IO/CPU overload) which unfortunately resulted with one placement group with the state "stale+active+clean", it's a placement group from .rgw.root pool:
1.15 0 0 0 0 0 0 1 1 stale+active+clean 2020-05-11 23:22:51.396288 40'1 2142:152 [3,2,6] 3 [3,2,6] 3 40'1 2020-04-22 00:46:05.904418 40'1 2020-04-20 20:18:13.371396 0
I guess there is no active replica of that object anywhere on the cluster. Restarting osd.3, osd.2 or osd.6 daemons does not help.
I've used ceph-objectstore-tool and successfully exported placement group from osd.3, osd.2 and osd.6 and tried to import it on a completely different OSD, the exports differ in filesize slightly, but the osd.3 wihch was the latest primary is the biggest so I've tried to import it on a different OSD, when starting up I see the following (this is from osd.1):
2020-05-14 21:43:19.779740 7f7880ac3700 1 osd.1 pg_epoch: 2459 pg[1.15( v 40'1 (0'0,40'1] local-lis/les=2073/2074 n=0 ec=73/39 lis/c 2073/2073 les/c/f 2074/2074/633 2145/39/2145) [] r=-1 lpr=2455 crt=40'1 lcod 0'0 unknown NOTIFY] state<Start>: transitioning to Stray
I see from previous pg dumps (several weeks before while it was still active+clean) that it was 115 bytes with zero objects in it but I am not sure how to interpret that.
As this is a pg from .rgw.root pool, I cannot get any response from the cluster when accessing (everything timeouts).
What is the correct course of action with this pg?
Any help would be greatly appriciated.
Thanks,
Tomislav
Hi list,
Thanks again for pointing me towards rbd-mirror!
I've read documentation, old mailing list posts, blog posts and some
additional guides. Seems like the tool to help me through my data migration.
Given one-way synchronisation and image-based (so, not pool based)
configuration, it's still unclear to me how the mirroring will cope with
an existing target pool, already consisting of (a lot of) images.
Has someone done this already? It feels quite scary with doom scenario's
like "cleaning up" the target pool and such in mind...
To sum up: my goal is to mirror clusterA/somepool with some specific
images to clusterB/someotherpool where already other images reside. The
mirrored images should be kept in sync and the other images should be
left alone completely.
Cheers,
Kees
--
https://nefos.nl/contact
Nefos IT bv
Ambachtsweg 25 (industrienummer 4217)
5627 BZ Eindhoven
Nederland
KvK 66494931
/Aanwezig op maandag, dinsdag, woensdag en vrijdag/