Hi everyone,
I am trying to configure the HA service for RGW with cephadm. I have two RGWs, on cnrgw1
and cnrgw2, serving the same pool.
I use a virtual IP address, 192.168.0.15 (cnrgwha), and the config from
https://docs.ceph.com/en/latest/cephadm/rgw/#high-availability-service-for-…
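cnrgwha is simply the hostname I use for the VIP; it resolves on both nodes, roughly via this in /etc/hosts (or the DNS equivalent):

192.168.0.15   cnrgwha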
# from root@cnrgw1
[root@cnrgw1 ~]# cat /etc/sysctl.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1
[root@cnrgw1 ~]# sysctl -p
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1
# same settings on cnrgw2
# generate a self-signed certificate
[vagrant@cn1 ~]# openssl req -x509 -nodes -days 365 -newkey rsa:2048
-keyout ./rgwha.key -out ./rgwha.crt
Generating a RSA private key
.............+++++
........................................................+++++
writing new private key to './rgwha.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:fr
State or Province Name (full name) []:est
Locality Name (eg, city) [Default City]:sbg
Organization Name (eg, company) [Default Company Ltd]:cephlab.org
Organizational Unit Name (eg, section) []:
Common Name (eg, your name or your server's hostname) []:cnrgwha
Email Address []:root@localhost
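To get the certificate and key into the quoted-list form used in the YAML below, I just quoted each PEM line, roughly like this (any equivalent quoting works, and the trailing comma on the last line needs removing by hand):

cat rgwha.crt rgwha.key | sed 's/.*/"&",/'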
# write the YAML rgwha.yaml
service_type: ha-rgw
service_id: haproxy_for_rgw
placement:
  hosts:
    - cnrgw1
    - cnrgw2
spec:
  virtual_ip_interface: eth1
  virtual_ip_address: 192.168.0.15/24
  frontend_port: 8080
  ha_proxy_port: 1967
  ha_proxy_stats_enabled: true
  ha_proxy_stats_user: admin
  ha_proxy_stats_password: true
  ha_proxy_enable_prometheus_exporter: true
  ha_proxy_monitor_uri: /haproxy_health
  keepalived_user: admin
  keepalived_password: admin
  ha_proxy_frontend_ssl_certificate:
    [
      "-----BEGIN CERTIFICATE-----",
      "MIICSzCCAfWgAwIBAgIUWKC9e+5tnIAjddECXOGc144p8E0wDQYJKoZIhvcNAQEL",
      "BQAwejELMAkGA1UEBhMCZnIxDDAKBgNVBAgMA2VzdDEMMAoGA1UEBwwDc2JnMRAw",
      "DgYDVQQKDAdjZXBobGFiMQwwCgYDVQQLDANvcmcxEDAOBgNVBAMMB2Nucmd3aGEx",
      "HTAbBgkqhkiG9w0BCQEWDnJvb3RAbG9jYWxob3N0MB4XDTIxMDMwOTE0MjI0N1oX",
      "DTIyMDMwOTE0MjI0N1owejELMAkGA1UEBhMCZnIxDDAKBgNVBAgMA2VzdDEMMAoG",
      "A1UEBwwDc2JnMRAwDgYDVQQKDAdjZXBobGFiMQwwCgYDVQQLDANvcmcxEDAOBgNV",
      "BAMMB2Nucmd3aGExHTAbBgkqhkiG9w0BCQEWDnJvb3RAbG9jYWxob3N0MFwwDQYJ",
      "KoZIhvcNAQEBBQADSwAwSAJBAMqji/AKBr6DbuHKOTWyIBWbeYkyZ7Jn7fqfZceE",
      "p7G321t1TvAjD7sa64FRT6n4x8CtzKPGXXpRr28o8oR1h70CAwEAAaNTMFEwHQYD",
      "VR0OBBYEFIQim5ZxojFny+srzQJIs1N8wLmYMB8GA1UdIwQYMBaAFIQim5ZxojFn",
      "y+srzQJIs1N8wLmYMA8GA1UdEwEB/wQFMAMBAf8wDQYJKoZIhvcNAQELBQADQQCE",
      "eCwMQFNYtw+4I1QzTV13ewawuPkPdrhiNzcs0mgt93+quE0zBIeOY2jnFmlo6H/h",
      "syYGvwgcAh9VW9qo5fsk",
      "-----END CERTIFICATE-----",
      "-----BEGIN PRIVATE KEY-----",
      "MIIBVQIBADANBgkqhkiG9w0BAQEFAASCAT8wggE7AgEAAkEAyqOL8AoGvoNu4co5",
      "NbIgFZt5iTJnsmft+p9lx4SnsbfbW3VO8CMPuxrrgVFPqfjHwK3Mo8ZdelGvbyjy",
      "hHWHvQIDAQABAkB0kt2AO+RhWS9CyZlb4JtAku66FLs/ETcAxQ5CV3g5beq8/wRs",
      "x3xZhIsjdr7OZZ+BEoJYn+0upywoctXmwM8BAiEA+KG26RADqJfAdoRn640UrT9E",
      "pfF3drDrQg0WrKAf3N0CIQDQpOZa0pV2GL28u2NaU85uJCDeKDWhTnvFEqlLu/S4",
      "YQIhAPY+0/WIUtdLVOcMxA/bLrtXihoASR1Yo+hLJkXaYTRRAiB3Rh1txD6vEXu+",
      "Hb2xUIGNE1g6x+/ItA4rXfysD9nZYQIhAKYn3IdG55JwiwSKv8gVAEdX8xiUfEjY",
      "pnvk3p52VHHI",
      "-----END PRIVATE KEY-----"
    ]
  ha_proxy_frontend_ssl_port: 8090
  ha_proxy_ssl_dh_param: 1024
  ha_proxy_ssl_ciphers: ECDH+AESGCM:!MD5
  ha_proxy_ssl_options: no-sslv3
  haproxy_container_image: haproxy:2.4-dev3-alpine
  keepalived_container_image: arcts/keepalived:1.2.2
# apply the new config
[ceph: root@cn1 ~]# ceph orch apply -i rgwha.yaml
Error EINVAL: ServiceSpec: __init__() got an unexpected keyword argument
'virtual_ip_interface'
Do you have any leads on why this doesn't work?
[ceph: root@cn1 /]# ceph versions
{
    "mon": {
        "ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus (stable)": 5
    },
    "mgr": {
        "ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus (stable)": 2
    },
    "osd": {
        "ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus (stable)": 8
    },
    "mds": {},
    "rgw": {
        "ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus (stable)": 2
    },
    "overall": {
        "ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus (stable)": 17
    }
}
Hello,
Pardon if this has been asked, but I'm just getting started with Rados
Gateway. I looked around for some hints about performance tuning and found
a reference to setting rgw_max_chunk_size = 4M. I suspect the material was
written during Jewel or earlier, so I'm wondering what the current best
practices are for Nautilus.
1) I couldn't find how to set this in Nautilus (my best guess at the syntax is below).
2) I found a mailing list post from August 2019 that talked about EC pools
and using a multiple of k * 4M.
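For what it's worth, this is the syntax I would have guessed for Nautilus, but I haven't been able to confirm that the option is still honored (4194304 is just 4M in bytes; client.rgw.rgw0 is a placeholder daemon name):

ceph config set client.rgw rgw_max_chunk_size 4194304
# or per daemon:
ceph config set client.rgw.rgw0 rgw_max_chunk_size 4194304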
Any insight, or a pointer to the right part of the docs would be greatly
appreciated.
Thanks.
-Dave
--
Dave Hall
Binghamton University
kdhall(a)binghamton.edu
I am in a situation where I see conflicting information.
On the one hand,
ls -l /var/lib/ceph/osd/ceph-7
shows a symlink for the block device, but no block.db.
On the other hand,
ceph-volume lvm list
claims that there is a separate db device registered for osd.7.
How can I know which one is correct?
(This is currently Ceph Nautilus.)
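In case it helps, I was also thinking of using the OSD's own metadata as a third data point, though I don't know whether it is any more authoritative than the other two:

ceph osd metadata 7 | grep -i bluefs
ceph-volume lvm list 7        # run on the host that carries osd.7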
--
Philip Brown| Sr. Linux System Administrator | Medata, Inc.
5 Peters Canyon Rd Suite 250
Irvine CA 92606
Office 714.918.1310| Fax 714.918.1325
pbrown(a)medata.com| www.medata.com
Hello,
When I run my borgbackup over a CephFS volume (10 subvolumes, about 1.5 TB), I
see a big increase in OSD space usage: 2 or 3 OSDs go near-full or full, then
out, and finally the cluster goes into an error state.
Any tips to prevent this?
My cluster is Ceph v15 with:
9 nodes:
each node runs 2x 6 TB HDD and 2x 600 GB SSD
CephFS has its data on the HDDs and its metadata on the SSDs.
The CephFS MDS cache is 32 GB.
128 PGs for data and for metadata (this was set up by the PG autoscaler).
Perhaps I should fix the pg_num for each CephFS pool and prevent the
autoscaler from changing them, e.g. with the commands below.
What do you think?
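Something like this is what I have in mind (the data pool name cephfs-data is assumed; cephfs-metadata is taken from the df output below):

ceph osd pool set cephfs-data pg_num 256
ceph osd pool set cephfs-metadata pg_num 1024
ceph osd pool set cephfs-data pg_autoscale_mode off
ceph osd pool set cephfs-metadata pg_autoscale_mode off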
Thank you for your help and advice.
UPDATE: I increased the pg number to 256 for data and 1024 for metadata.
Here is the pool usage 30 minutes after the backup started:
POOL            ID  STORED   OBJECTS  USED     %USED  MAX AVAIL
cephfs-metadata 12  183 GiB  514.68k  550 GiB  7.16   2.3 TiB
Before the backup, STORED was 20 GiB.
oau
Hello everyone,
We have installed a Nautilus Ceph cluster with 3 monitors, 5 OSDs and 1
RGW gateway.
It works, but now we need to change the IP addresses of these machines to
put them in a DMZ.
Are there any recommendations on how to go about doing this?
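I assume it will involve re-registering the monitors with their new addresses, along the lines of the monmap-editing procedure in the docs; roughly this, if I read it correctly (monitor ID and DMZ address below are placeholders):

ceph mon getmap -o /tmp/monmap
monmaptool --rm mon1 /tmp/monmap
monmaptool --add mon1 10.1.2.3:6789 /tmp/monmap
ceph-mon -i mon1 --inject-monmap /tmp/monmap     # with the monitor stopped

Is that still the recommended way, or is there something less invasive?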
Best regards,
Before I upgrade our Nautilus ceph to Octopus, I would like to make sure
I am able to replace existing OSDs when they fail. However, I have not
been able to create an OSD in Octopus with the layout we are using in
Nautilus. I am testing this on a VM cluster so as not to touch any
production systems.
Our existing servers partition one SSD into block.db devices for 4 HDD OSDs, i.e.:
OSD.0 block: /dev/sda, block.db: /dev/sdn1
OSD.1 block: /dev/sdb, block.db: /dev/sdn2
OSD.2 block: /dev/sdc, block.db: /dev/sdn3
OSD.3 block: /dev/sdd, block.db: /dev/sdn4
etc.
My understanding is that I will need to apply an advanced OSD service spec to
achieve this layout. As each server is similar to the above, I could
create 4 services (osd.disk0 - osd.disk3) and apply them to each host. I
tried something similar to this:
service_type: osd
service_id: disk0
placement:
  host_pattern: 'storage*'
data_devices:
  paths:
    - /dev/sda
db_devices:
  paths:
    - /dev/sdn1
But the YAML was rejected with "Exception: Failed to validate Drive
Group: `paths` is only allowed for data_devices", although it appears to
be valid in the data structures here:
https://docs.ceph.com/en/latest/cephadm/osd/#deploy-osds
https://people.redhat.com/bhubbard/nature/default/mgr/orchestrator_modules/
I tried to use a combination of size and db_slots for the db_devices, but
I could not get the OSD to put the block.db on the separate device. Is
this possible using the advanced placement specs of the orchestrator, or
should I just focus on using "cephadm ceph-volume" to create the OSDs in
the desired layout?
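For completeness, the "cephadm ceph-volume" fallback I have in mind would be roughly the following, run once per OSD (device names taken from the layout above; I have not verified this end to end):

cephadm ceph-volume -- lvm create --bluestore --data /dev/sda --block.db /dev/sdn1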
NOTE: I am trying to avoid installing Ceph directly on the host OS (just to
be able to run ceph-volume there), as I do like the containerized approach in
Octopus.
Thank you,
Gary
--
Gary Molenkamp Computer Science/Science Technology Services
Systems Administrator University of Western Ontario
molenkam(a)uwo.ca http://www.csd.uwo.ca
(519) 661-2111 x86882 (519) 661-3566
I would file this as a potential bug, but it takes too long to get approved, and tracker.ceph.com doesn't have straightforward Google sign-in enabled :-/
I believe that with the new LVM mandate, ceph-volume should not be complaining about a "missing PARTUUID".
This is stopping me from using my system.
Details on how to recreate:
1. have a system with 1 SSD and multiple HDDs
2. create a bunch of OSDs with your preferred frontend, which will eventually come down to
ceph-volume lvm batch --bluestore /dev/ssddevice /dev/sdA ... /dev/sdX
THIS will work great. Batch mode will appropriately carve up the SSD device into multiple LVs and allocate one of them as the DB device for each of the HDDs.
3. try to repair/replace an HDD
As soon as you have an HDD fail, you will need to recreate the OSD... and you are then stuck, because you can't use batch mode for it,
and you can't do it more granularly with
ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdg --block.db /dev/ceph-xx-xx-xx/ceph-osd-db-this-is-the-old-lvm-for-ssd here
because ceph-volume will complain that,
blkid could not detect a PARTUUID for device: /dev/ceph-xx-xx-xx/ceph-osd-db-this-is-the-old-lvm-for-ssd here
but the LV IS NOT SUPPOSED TO HAVE A PARTUUID.
Which is provable, first of all, by the fact that it isn't a partition; and secondly, by the fact that none of the other block.db LVs it created on the SSD in batch mode have a PARTUUID either!!
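To make that concrete, this is the kind of check I mean (the VG/LV names below are placeholders for the ones batch mode created):

for lv in /dev/ceph-xx-xx-xx/ceph-osd-db-*; do
    echo "$lv"
    blkid -s PARTUUID "$lv"    # prints nothing: an LV has no PARTUUID
done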
So kindly quit checking for something that isn't supposed to be there in the first place?!
(This is a bug all the way back in Nautilus, through latest, I believe.)
--
Philip Brown| Sr. Linux System Administrator | Medata, Inc.
5 Peters Canyon Rd Suite 250
Irvine CA 92606
Office 714.918.1310| Fax 714.918.1325
pbrown(a)medata.com| www.medata.com
Hi all,
I am running a 15.2.10 Ceph cluster on Ubuntu 18.04. I created an RBD image and mapped it on a host (Ubuntu 18.04, Ceph 15.2.10).
When I run mkfs.xfs -f /dev/rbd0 it hangs, but writing data with "rados -p {poolname} put obj myfile" works fine.
Has anyone encountered this kind of problem?
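For reference, the sequence was roughly this (pool and image names are placeholders for my real ones):

rbd create mypool/myimage --size 10G
rbd map mypool/myimage        # appears as /dev/rbd0
mkfs.xfs -f /dev/rbd0         # hangs here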
Dear Cephers,
My client has 10 volumes; each volume was assigned 8192 PGs, 81920 PGs in total. The cluster runs Luminous with BlueStore. During a power outage the cluster restarted, and we observed that OSD peering consumed a lot of CPU and memory, even leading to some OSD flapping.
My questions are thus: 1) how can we speed up OSD peering and avoid OSD flapping when there are a lot of PGs in the cluster, and 2) is there a practical limit on the number of PGs for a single cluster?
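To put rough numbers on it (assuming 3x replication purely for the arithmetic, and the commonly cited target of roughly 100 PGs per OSD): 81920 PGs x 3 = 245760 PG instances, which would correspond to something like 2400-2500 OSDs at that target; with fewer OSDs, each OSD has correspondingly more PGs to peer after a restart.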
best regards,
Samuel
huxiaoyu(a)horebdata.cn
Good morning.
I have a bucket with 50M objects in it. The bucket was created with
multisite sync, and it is now the master zone and the only zone.
After a health check, I saw weird objects stuck in a pending attr state.
I've tried to remove them with "radosgw-admin object rm --bypass-gc",
but I couldn't delete them. After sending the delete command I see the object
attributes changing to the pending state, but they stay like that.
I've also tried rados rm; the object data is deleted and no longer visible
with radosgw-admin object stat. After that I tried to copy the object with
rclone (source to dest), but I could not write the object to the same path!
Rclone gives an error, but the object is created anyway, and I see a 0-byte
object with rados and radosgw-admin again!
After that I tried to copy the problematic object from
/samebucket/objectpath/theObject --> /samebucket/theObject, and the 0-byte
object copied successfully and is not 0 bytes anymore! It looks OK!
Then, after all of this, I wonder: is it an index issue?
I've run radosgw-admin bi list and I saw 29 entries for the same object (I
guess all these retries left most of them behind). I also ran bi list
for a normal object and I see only 5 entries (the versions).
Here is the bi list output for the problematic object:
https://paste.ubuntu.com/p/jRtH3cMC94/
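For reference, these are the commands behind the attempts and listings above (bucket and object names are redacted placeholders):

radosgw-admin object rm --bucket=mybucket --object=path/to/theObject --bypass-gc
radosgw-admin object stat --bucket=mybucket --object=path/to/theObject
radosgw-admin bi list --bucket=mybucket --object=path/to/theObject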
I'm too scared to run a check on, or re-create, the bucket index.
Has anyone seen the same issue?