Hi all,
I am new to Ceph, but I have a good understanding of the iSCSI protocol. I
will dive into Ceph because it looks promising, and I am particularly
interested in Ceph RBD. I have a request: can you please tell me what the
common similarities between iSCSI and Ceph are, if any? If someone had to
work on a common model for iSCSI and Ceph, what significant points would
you suggest to someone who already has some understanding of iSCSI?
Looking forward to your answers. Thanks in advance :-)
BR
I have a Ceph setup with 3 hosts and 10 4 TB HDDs per host. I defined a 3-replica RBD pool and some images and presented them to a VMware host via iSCSI, but the write performance is so bad that I managed to freeze a VM doing a big rsync to a datastore inside Ceph and had to reboot its host (it seems I filled up VMware's iSCSI queue).
Right now I'm getting write latencies from 20 ms to 80 ms (per OSD), sometimes peaking at 600 ms (per OSD).
Client throughput is around 4 MB/s.
Using a 4 MB, stripe 1 image I got 1.955.359 B/s inside the VM.
On a 1 MB, stripe 1 image I got 2.323.206 B/s inside the same VM.
I think the performance is way slower than it should be, and that I can fix this by correcting some configuration.
Any advice?
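For reference, one way to tell whether the latency comes from the Ceph side or from the iSCSI/VMware path is to benchmark the backing pool directly with rados bench from an admin or gateway node (the pool name "rbd" below is an assumption; substitute your own):

    # 30-second sequential-write test (default 4 MB objects) against the pool backing the images
    rados bench -p rbd 30 write -t 16 --no-cleanup
    # read the benchmark objects back, then remove them
    rados bench -p rbd 30 seq -t 16
    rados -p rbd cleanup

If rados bench already shows latencies in the hundreds of milliseconds, the problem is on the OSD/network side; if it looks healthy, the iSCSI gateway layer is the more likely suspect.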
--
Salsa
Hi there!
I got the task to connect a Windows client to our existing ceph cluster.
I'm looking for experiences or suggestions from the community.
Two possibilities came to mind:
1. iSCSI Target on RBD exported to Windows
2. NFS-Ganesha on CephFS exported to Windows
Is there a third way of exporting a Ceph cluster to a Windows machine?
I have some experience with CephFS. We have a small cluster running successfully for Linux clients. I don't have experience with RBD or iSCSI.
The Windows machine will use the space for backups. The kind of data is unknown; I expect it to be an MS-SQL dump and user data from a SharePoint system. The Windows admin does not care whether NFS or iSCSI is used.
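For option 1, the Ceph-side piece is just an RBD image that the iSCSI gateway then exports as a LUN; a minimal sketch, assuming a dedicated pool and image name (both are made up here):

    # thin-provisioned image to back the iSCSI LUN for the Windows backups
    rbd create backups/win-backup --size 4T --image-feature layering
    # the image is then added as a disk in the ceph-iscsi gateway and mapped
    # to the Windows initiator's IQN; Windows formats it as NTFS

The practical difference is that with iSCSI the Windows box gets a block device it owns and formats itself, while with NFS-Ganesha on CephFS it gets a network share.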
I'd be happy if some of you could share experiences.
Thanks
Lars
Hello, Ceph users,
does anybody use Ceph on the recently released CentOS 8? Apparently there are
no el8 packages either at download.ceph.com or in the native CentOS package
tree. I am thinking about upgrading my cluster to C8 (because of other
software running on it apart from Ceph). Do the el7 packages simply work?
Can they be rebuilt using rpmbuild --rebuild? Or is running Ceph on
C8 more complicated than that?
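For context, what I mean by the rebuild route is roughly the following; the version number and URL path are only examples based on the el7 Nautilus repo layout, not something I have verified on C8:

    # build toolchain on CentOS 8
    dnf install -y rpm-build dnf-plugins-core
    # fetch an el7 source RPM (version is an example)
    curl -O https://download.ceph.com/rpm-nautilus/el7/SRPMS/ceph-14.2.4-0.el7.src.rpm
    # pull in build dependencies and rebuild against el8 libraries
    dnf builddep -y ceph-14.2.4-0.el7.src.rpm
    rpmbuild --rebuild ceph-14.2.4-0.el7.src.rpm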
Thanks,
-Yenya
--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| http://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 |
sir_clive> I hope you don't mind if I steal some of your ideas?
laryross> As far as stealing... we call it sharing here. --from rcgroups
Hello Team,
We've integrated a Ceph cluster with Kubernetes and are provisioning
volumes through rbd-provisioner. When we create volumes from YAML files
in Kubernetes (PV > PVC > mounted into a pod), the PVCs on the Kubernetes
side get the meaningful names defined in the YAML files, but in the Ceph
cluster the RBD image names are created with a dynamic UID.
This makes it tedious to find the exact RBD image when troubleshooting.
Please find the provisioning logs in the snippet pasted below.
kubectl get pods,pv,pvc
NAME            READY   STATUS    RESTARTS   AGE
pod/sleepypod   1/1     Running   0          4m9s

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                  STORAGECLASS   REASON   AGE
persistentvolume/pvc-cd37d2d6-cecc-4a05-9736-c8d80abde7f5   1Gi        RWO            Delete           Bound    default/test-dyn-pvc   ceph-rbd                4m9s

NAME                                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/test-dyn-pvc   Bound    pvc-cd37d2d6-cecc-4a05-9736-c8d80abde7f5   1Gi        RWO            ceph-rbd       4m11s
rbd-provisioner logs:
I1121 10:59:15.009012 1 provision.go:132] successfully created rbd image "kubernetes-dynamic-pvc-f4eac482-0c4d-11ea-8d70-8a582e0eb4e2"
I1121 10:59:15.009092 1 controller.go:1087] provision "default/test-dyn-pvc" class "ceph-rbd": volume "pvc-cd37d2d6-cecc-4a05-9736-c8d80abde7f5" provisioned
I1121 10:59:15.009138 1 controller.go:1101] provision "default/test-dyn-pvc" class "ceph-rbd": trying to save persistentvolume "pvc-cd37d2d6-cecc-4a05-9736-c8d80abde7f5"
I1121 10:59:15.020418 1 controller.go:1108] provision "default/test-dyn-pvc" class "ceph-rbd": persistentvolume "pvc-cd37d2d6-cecc-4a05-9736-c8d80abde7f5" saved
I1121 10:59:15.020476 1 controller.go:1149] provision "default/test-dyn-pvc" class "ceph-rbd": succeeded
I1121 10:59:15.020802 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"test-dyn-pvc", UID:"cd37d2d6-cecc-4a05-9736-c8d80abde7f5", APIVersion:"v1", ResourceVersion:"24545639", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-cd37d2d6-cecc-4a05-9736-c8d80abde7f5
RBD image details on the Ceph cluster side:
rbd -p kube ls --long
NAME SIZE PARENT FMT PROT LOCK
kubernetes-dynamic-pvc-f4eac482-0c4d-11ea-8d70-8a582e0eb4e2 1 GiB 2
Is there a way to set up a proper naming convention for the RBD images as
well during the Kubernetes deployment itself?
Kubernetes version: v1.15.5
Ceph cluster version: 14.2.2 nautilus (stable)
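For what it's worth, since rbd-provisioner creates the PV with an in-tree RBD volume source, the PVC-to-image mapping can at least be recovered from the PV spec, which makes the troubleshooting step less painful; a small sketch using the PVC from the example above:

    # resolve the PV bound to the PVC, then read pool and image from its spec
    PV=$(kubectl get pvc test-dyn-pvc -o jsonpath='{.spec.volumeName}')
    kubectl get pv "$PV" -o jsonpath='{.spec.rbd.pool}/{.spec.rbd.image}{"\n"}'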
Best Regards,
Palanisamy
Hi,
running 14.2.6, debian buster (backports).
Have set up a cephfs with 3 data pools and one metadata pool:
myfs_data, myfs_data_hdd, myfs_data_ssd, and myfs_metadata.
The data of all files is, via ceph.dir.layout.pool, stored in either
myfs_data_hdd or myfs_data_ssd. This has also been checked by dumping the
ceph.file.layout.pool attribute of all files.
The filesystem has 1617949 files and 36042 directories.
There are, however, approximately as many objects in the first pool created
for the cephfs, myfs_data, as there are files. Their number also grows or
shrinks as files are created or deleted (so they cannot be leftovers from
earlier exercises). Note how the USED size is reported as 0 bytes,
correctly reflecting that no file data is stored in that pool.
POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR USED COMPR UNDER COMPR
myfs_data 0 B 1618229 0 4854687 0 0 0 2263590 129 GiB 23312479 124 GiB 0 B 0 B
myfs_data_hdd 831 GiB 136309 0 408927 0 0 0 106046 200 GiB 269084 277 GiB 0 B 0 B
myfs_data_ssd 43 GiB 1552412 0 4657236 0 0 0 181468 2.3 GiB 4661935 12 GiB 0 B 0 B
myfs_metadata 1.2 GiB 36096 0 108288 0 0 0 4828623 82 GiB 1355102 143 GiB 0 B 0 B
Is this expected?
I was assuming that in this scenario all objects, both their data and any
omap keys, would be either in the metadata pool or in the two pools where
the file data is stored.
Is it some additional metadata (keys or attributes) that is stored in the
first-created data pool of the cephfs? That would not be so nice in case the
OSD selection rules for that pool use worse disks than the data itself...
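One way to see what those zero-size objects actually hold is to inspect one of them with rados; as far as I understand, CephFS keeps backtrace information as an xattr on objects in the first data pool, and a check like the following should make that visible (the object picked is simply the first one listed):

    # take one object from the default data pool and look at its xattrs
    OBJ=$(rados -p myfs_data ls | head -n 1)
    rados -p myfs_data stat "$OBJ"        # expect size 0
    rados -p myfs_data listxattr "$OBJ"   # typically shows the 'parent' backtrace xattr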
By the way: is there any tool to see the amount of key-value (omap) data
associated with a pool? 'ceph osd df' gives omap and meta per OSD, but not
broken down per pool.
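A possible workaround, assuming the per-PG stats in Nautilus expose omap byte/key counters under stat_sum (and keeping in mind the values are only as fresh as the last deep scrub), would be to sum them per pool:

    # sum the reported omap bytes over all PGs of one pool
    ceph pg ls-by-pool myfs_metadata -f json \
        | jq '[.pg_stats[].stat_sum.num_omap_bytes] | add'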
Best regards,
Håkan
Hi, everyone,
my ceph version is 12.2.12. I want to set require-min-compat-client to
luminous, so I use the command
#ceph osd set-require-min-compat-client luminous
but ceph reports: Error EPERM: cannot set require_min_compat_client to
luminous: 4 connected client(s) look like jewel (missing
0xa00000000200000); add --yes-i-really-mean-it to do it anyway
[root@node-1 ~]# ceph features
{
    "mon": {
        "group": {
            "features": "0x3ffddff8eeacfffb",
            "release": "luminous",
            "num": 3
        }
    },
    "osd": {
        "group": {
            "features": "0x3ffddff8eeacfffb",
            "release": "luminous",
            "num": 15
        }
    },
    "client": {
        "group": {
            "features": "0x40106b84a842a52",
            "release": "jewel",
            "num": 4
        },
        "group": {
            "features": "0x3ffddff8eeacfffb",
            "release": "luminous",
            "num": 168
        }
    }
}
So I ran the command:
[root@node-1 gyt]# ceph osd set-require-min-compat-client luminous
--yes-i-really-mean-it
set require_min_compat_client to luminous
But now I want to set require-min-compat-client back to jewel, so I use the command:
[root@node-1 gyt]# ceph osd set-require-min-compat-client jewel
Error EPERM: osdmap current utilizes features that require luminous;
cannot set require_min_compat_client below that to jewel
What is the way to change it back from luminous to jewel?
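In case it helps, a sketch of how the four jewel clients could be identified before (or after) forcing the flag; run on a monitor host, grepping for the jewel feature mask that ceph features reported above (the mon id is an assumption):

    # list the monitor's sessions and pick out the ones with the jewel feature set
    ceph daemon mon.node-1 sessions | grep 0x40106b84a842a52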
Quick question for the Ceph gurus.
For a 1.1 PB raw cephfs system currently storing 191 TB of data and 390 million objects (mostly small Python and ML training files, etc.), how many MDS servers should I be running?
System is Nautilus 14.2.8.
I ask because up to now I have run one MDS with one standby-replay, and occasionally it blows up with large memory consumption, 60 GB+, even though I have mds_cache_memory_limit = 32G (it was 16G until recently). It then tries to restart on another MDS node, fails again, and after several attempts usually comes back up. Today I increased to two active MDSs, but the question is: what is the optimal number for a pretty active system? The single MDS seemed to regularly run at around 1400 req/s, and I often get up to six clients failing to respond to cache pressure.
The current setup is:
ceph fs status
cephfs - 71 clients
======
+------+----------------+--------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+----------------+--------+---------------+-------+-------+
| 0 | active | a | Reqs: 447 /s | 12.0M | 11.9M |
| 1 | active | b | Reqs: 154 /s | 1749k | 1686k |
| 1-s | standby-replay | c | Evts: 136 /s | 1440k | 1423k |
| 0-s | standby-replay | d | Evts: 402 /s | 16.8k | 298 |
+------+----------------+--------+---------------+-------+-------+
+-----------------+----------+-------+-------+
| Pool | type | used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 160G | 169G |
| cephfs_data | data | 574T | 140T |
+-----------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| w |
| x |
| y |
| z |
+-------------+
MDS version: ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb) nautilus (stable)
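In case it is useful context, these are the knobs I have been touching; a minimal sketch, assuming the filesystem is called "cephfs" and that pinning one busy subtree to rank 1 would be acceptable (the mount path is an example):

    # keep two active MDS ranks
    ceph fs set cephfs max_mds 2
    # raise the MDS cache limit cluster-wide (40 GiB here is only an example)
    ceph config set mds mds_cache_memory_limit 42949672960
    # optionally pin a busy directory tree to rank 1 from a client mount
    setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/projects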
Regards.
Robert Ruge
Systems & Network Manager
Faculty of Science, Engineering & Built Environment
This is the second time this has happened in a couple of weeks. The MDS locks
up and the standby can't take over, so the monitors blacklist them. I try
to un-blacklist them, but they still say this in the logs:
mds.0.1184394 waiting for osdmap 234947 (which blacklists prior instance)
Looking at a pg dump, it looks like the epoch is past that.
$ ceph pg map 3.756
osdmap e234953 pg 3.756 (3.756) -> up [113,180,115] acting [113,180,115]
Last time, it seemed to just recover after about an hour all by itself.
Any way to speed this up?
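For reference, this is roughly what I check in that state (the blacklist address below is a placeholder):

    # see which MDS instances are currently blacklisted and until when
    ceph osd blacklist ls
    # remove a stale entry so the restarted daemon isn't blocked on it
    ceph osd blacklist rm 192.168.1.10:6800/12345
    # confirm the MDS map afterwards
    ceph mds stat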
Thank you,
Robert LeBlanc
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1