Hi everyone,
I am trying to configure the HA service for RGW with cephadm. I have two RGWs, on cnrgw1
and cnrgw2, serving the same pool.
I use a virtual IP address, 192.168.0.15 (cnrgwha), and the config from
https://docs.ceph.com/en/latest/cephadm/rgw/#high-availability-service-for-…
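cnrgwha is simply the hostname I use for the VIP; it resolves on both nodes, roughly via this in /etc/hosts (or the DNS equivalent):

192.168.0.15   cnrgwha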
# from root@cnrgw1
[root@cnrgw1 ~]# cat /etc/sysctl.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1
[root@cnrgw1 ~]# sysctl -p
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1
# same settings on cnrgw2
# generate a self-signed certificate
[vagrant@cn1 ~]# openssl req -x509 -nodes -days 365 -newkey rsa:2048
-keyout ./rgwha.key -out ./rgwha.crt
Generating a RSA private key
.............+++++
........................................................+++++
writing new private key to './rgwha.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:fr
State or Province Name (full name) []:est
Locality Name (eg, city) [Default City]:sbg
Organization Name (eg, company) [Default Company Ltd]:cephlab.org
Organizational Unit Name (eg, section) []:
Common Name (eg, your name or your server's hostname) []:cnrgwha
Email Address []:root@localhost
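To get the certificate and key into the quoted-list form used in the YAML below, I just quoted each PEM line, roughly like this (any equivalent quoting works, and the trailing comma on the last line needs removing by hand):

cat rgwha.crt rgwha.key | sed 's/.*/"&",/'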
# write the YAML rgwha.yaml
service_type: ha-rgw
service_id: haproxy_for_rgw
placement:
  hosts:
    - cnrgw1
    - cnrgw2
spec:
  virtual_ip_interface: eth1
  virtual_ip_address: 192.168.0.15/24
  frontend_port: 8080
  ha_proxy_port: 1967
  ha_proxy_stats_enabled: true
  ha_proxy_stats_user: admin
  ha_proxy_stats_password: true
  ha_proxy_enable_prometheus_exporter: true
  ha_proxy_monitor_uri: /haproxy_health
  keepalived_user: admin
  keepalived_password: admin
  ha_proxy_frontend_ssl_certificate:
    [
      "-----BEGIN CERTIFICATE-----",
      "MIICSzCCAfWgAwIBAgIUWKC9e+5tnIAjddECXOGc144p8E0wDQYJKoZIhvcNAQEL",
      "BQAwejELMAkGA1UEBhMCZnIxDDAKBgNVBAgMA2VzdDEMMAoGA1UEBwwDc2JnMRAw",
      "DgYDVQQKDAdjZXBobGFiMQwwCgYDVQQLDANvcmcxEDAOBgNVBAMMB2Nucmd3aGEx",
      "HTAbBgkqhkiG9w0BCQEWDnJvb3RAbG9jYWxob3N0MB4XDTIxMDMwOTE0MjI0N1oX",
      "DTIyMDMwOTE0MjI0N1owejELMAkGA1UEBhMCZnIxDDAKBgNVBAgMA2VzdDEMMAoG",
      "A1UEBwwDc2JnMRAwDgYDVQQKDAdjZXBobGFiMQwwCgYDVQQLDANvcmcxEDAOBgNV",
      "BAMMB2Nucmd3aGExHTAbBgkqhkiG9w0BCQEWDnJvb3RAbG9jYWxob3N0MFwwDQYJ",
      "KoZIhvcNAQEBBQADSwAwSAJBAMqji/AKBr6DbuHKOTWyIBWbeYkyZ7Jn7fqfZceE",
      "p7G321t1TvAjD7sa64FRT6n4x8CtzKPGXXpRr28o8oR1h70CAwEAAaNTMFEwHQYD",
      "VR0OBBYEFIQim5ZxojFny+srzQJIs1N8wLmYMB8GA1UdIwQYMBaAFIQim5ZxojFn",
      "y+srzQJIs1N8wLmYMA8GA1UdEwEB/wQFMAMBAf8wDQYJKoZIhvcNAQELBQADQQCE",
      "eCwMQFNYtw+4I1QzTV13ewawuPkPdrhiNzcs0mgt93+quE0zBIeOY2jnFmlo6H/h",
      "syYGvwgcAh9VW9qo5fsk",
      "-----END CERTIFICATE-----",
      "-----BEGIN PRIVATE KEY-----",
      "MIIBVQIBADANBgkqhkiG9w0BAQEFAASCAT8wggE7AgEAAkEAyqOL8AoGvoNu4co5",
      "NbIgFZt5iTJnsmft+p9lx4SnsbfbW3VO8CMPuxrrgVFPqfjHwK3Mo8ZdelGvbyjy",
      "hHWHvQIDAQABAkB0kt2AO+RhWS9CyZlb4JtAku66FLs/ETcAxQ5CV3g5beq8/wRs",
      "x3xZhIsjdr7OZZ+BEoJYn+0upywoctXmwM8BAiEA+KG26RADqJfAdoRn640UrT9E",
      "pfF3drDrQg0WrKAf3N0CIQDQpOZa0pV2GL28u2NaU85uJCDeKDWhTnvFEqlLu/S4",
      "YQIhAPY+0/WIUtdLVOcMxA/bLrtXihoASR1Yo+hLJkXaYTRRAiB3Rh1txD6vEXu+",
      "Hb2xUIGNE1g6x+/ItA4rXfysD9nZYQIhAKYn3IdG55JwiwSKv8gVAEdX8xiUfEjY",
      "pnvk3p52VHHI",
      "-----END PRIVATE KEY-----"
    ]
  ha_proxy_frontend_ssl_port: 8090
  ha_proxy_ssl_dh_param: 1024
  ha_proxy_ssl_ciphers: ECDH+AESGCM:!MD5
  ha_proxy_ssl_options: no-sslv3
  haproxy_container_image: haproxy:2.4-dev3-alpine
  keepalived_container_image: arcts/keepalived:1.2.2
# apply the new config
[ceph: root@cn1 ~]# ceph orch apply -i rgwha.yaml
Error EINVAL: ServiceSpec: __init__() got an unexpected keyword argument
'virtual_ip_interface'
Do you have any leads on why this doesn't work?
[ceph: root@cn1 /]# ceph versions
{
    "mon": {
        "ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus (stable)": 5
    },
    "mgr": {
        "ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus (stable)": 2
    },
    "osd": {
        "ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus (stable)": 8
    },
    "mds": {},
    "rgw": {
        "ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus (stable)": 2
    },
    "overall": {
        "ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus (stable)": 17
    }
}
Hello,
Pardon if this has been asked, but I'm just getting started with Rados
Gateway. I looked around for some hints about performance tuning and found
a reference to setting rgw_max_chunk_size = 4M. I suspect the material was
written during Jewel or earlier, so I'm wondering what the current best
practices are for Nautilus.
1) I couldn't find how to set this in Nautilus (my best guess at the syntax is below).
2) I found a mailing list post from August 2019 that talked about EC pools
and using a multiple of k * 4M.
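For what it's worth, this is the syntax I would have guessed for Nautilus, but I haven't been able to confirm that the option is still honored (4194304 is just 4M in bytes; client.rgw.rgw0 is a placeholder daemon name):

ceph config set client.rgw rgw_max_chunk_size 4194304
# or per daemon:
ceph config set client.rgw.rgw0 rgw_max_chunk_size 4194304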
Any insight, or a pointer to the right part of the docs would be greatly
appreciated.
Thanks.
-Dave
--
Dave Hall
Binghamton University
kdhall(a)binghamton.edu
I am in a situation where I see conflicting information.
On the one hand,
ls -l /var/lib/ceph/osd/ceph-7
shows a symlink for the block device, but no block.db.
On the other hand,
ceph-volume lvm list
claims that there is a separate db device registered for osd.7.
How can I know which one is correct?
(This is currently Ceph Nautilus.)
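In case it helps, I was also thinking of using the OSD's own metadata as a third data point, though I don't know whether it is any more authoritative than the other two:

ceph osd metadata 7 | grep -i bluefs
ceph-volume lvm list 7        # run on the host that carries osd.7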
--
Philip Brown| Sr. Linux System Administrator | Medata, Inc.
5 Peters Canyon Rd Suite 250
Irvine CA 92606
Office 714.918.1310| Fax 714.918.1325
pbrown(a)medata.com| www.medata.com
Hello,
When I run my borgbackup over a CephFS volume (10 subvolumes, about 1.5 TB), I
see a big increase in OSD space usage: 2 or 3 OSDs go near-full or full, then
out, and finally the cluster goes into an error state.
Any tips to prevent this?
My cluster is Ceph v15 with:
9 nodes:
each node runs 2x 6 TB HDD and 2x 600 GB SSD
CephFS has its data on the HDDs and its metadata on the SSDs.
The CephFS MDS cache is 32 GB.
128 PGs for data and for metadata (this was set up by the PG autoscaler).
Perhaps I should fix the pg_num for each CephFS pool and prevent the
autoscaler from changing them, e.g. with the commands below.
What do you think?
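Something like this is what I have in mind (the data pool name cephfs-data is assumed; cephfs-metadata is taken from the df output below):

ceph osd pool set cephfs-data pg_num 256
ceph osd pool set cephfs-metadata pg_num 1024
ceph osd pool set cephfs-data pg_autoscale_mode off
ceph osd pool set cephfs-metadata pg_autoscale_mode off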
Thank you for your help and advice.
UPDATE: I increased the pg number to 256 for data and 1024 for metadata.
Here is the pool usage 30 minutes after the backup started:
POOL            ID  STORED   OBJECTS  USED     %USED  MAX AVAIL
cephfs-metadata 12  183 GiB  514.68k  550 GiB  7.16   2.3 TiB
Before the backup, STORED was 20 GiB.
oau
Hello everyone,
We have installed a Nautilus Ceph cluster with 3 monitors, 5 OSDs and 1
RGW gateway.
It works, but now we need to change the IP addresses of these machines to
put them in a DMZ.
Are there any recommendations on how to go about doing this?
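I assume it will involve re-registering the monitors with their new addresses, along the lines of the monmap-editing procedure in the docs; roughly this, if I read it correctly (monitor ID and DMZ address below are placeholders):

ceph mon getmap -o /tmp/monmap
monmaptool --rm mon1 /tmp/monmap
monmaptool --add mon1 10.1.2.3:6789 /tmp/monmap
ceph-mon -i mon1 --inject-monmap /tmp/monmap     # with the monitor stopped

Is that still the recommended way, or is there something less invasive?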
Best regards,
Before I upgrade our Nautilus ceph to Octopus, I would like to make sure
I am able to replace existing OSDs when they fail. However, I have not
been able to create an OSD in Octopus with the layout we are using in
Nautilus. I am testing this on a VM cluster so as not to touch any
production systems.
Our existing servers partition one SSD into block.db devices for 4 HDD OSDs, i.e.:
OSD.0 block: /dev/sda, block.db: /dev/sdn1
OSD.1 block: /dev/sdb, block.db: /dev/sdn2
OSD.2 block: /dev/sdc, block.db: /dev/sdn3
OSD.3 block: /dev/sdd, block.db: /dev/sdn4
etc.
My understanding is that I will need to apply an advanced OSD service spec to
achieve this layout. As each server is similar to the above, I could
create 4 services (osd.disk0 - osd.disk3) and apply them to each host. I
tried something similar to this:
service_type: osd
service_id: disk0
placement:
  host_pattern: 'storage*'
data_devices:
  paths:
    - /dev/sda
db_devices:
  paths:
    - /dev/sdn1
But the YAML was rejected with "Exception: Failed to validate Drive
Group: `paths` is only allowed for data_devices", although it appears to
be valid in the data structures here:
https://docs.ceph.com/en/latest/cephadm/osd/#deploy-osds
https://people.redhat.com/bhubbard/nature/default/mgr/orchestrator_modules/
I tried to use a combination of size and db_slots for the db_devices, but
I could not get the OSD to put the block.db on the separate device. Is
this possible using the advanced placement specs of the orchestrator, or
should I just focus on using "cephadm ceph-volume" to create the OSDs in
the desired layout?
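For completeness, the "cephadm ceph-volume" fallback I have in mind would be roughly the following, run once per OSD (device names taken from the layout above; I have not verified this end to end):

cephadm ceph-volume -- lvm create --bluestore --data /dev/sda --block.db /dev/sdn1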
NOTE: I am trying to avoid installing Ceph directly on the host OS (just to
be able to run ceph-volume there), as I do like the containerized approach in
Octopus.
Thank you,
Gary
--
Gary Molenkamp Computer Science/Science Technology Services
Systems Administrator University of Western Ontario
molenkam(a)uwo.ca http://www.csd.uwo.ca
(519) 661-2111 x86882 (519) 661-3566
I would file this as a potential bug, but it takes too long to get approved, and tracker.ceph.com doesn't have straightforward Google sign-in enabled :-/
I believe that with the new LVM mandate, ceph-volume should not be complaining about a "missing PARTUUID".
This is stopping me from using my system.
Details on how to recreate:
1. have a system with 1 SSD and multiple HDDs
2. create a bunch of OSDs with your preferred frontend, which will eventually come down to
ceph-volume lvm batch --bluestore /dev/ssddevice /dev/sdA ... /dev/sdX
THIS will work great. Batch mode will appropriately carve up the SSD device into multiple LVs and allocate one of them as the DB device for each of the HDDs.
3. try to repair/replace an HDD
As soon as you have an HDD fail, you will need to recreate the OSD... and you are then stuck, because you can't use batch mode for it,
and you can't do it more granularly with
ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdg --block.db /dev/ceph-xx-xx-xx/ceph-osd-db-this-is-the-old-lvm-for-ssd here
because ceph-volume will complain that,
blkid could not detect a PARTUUID for device: /dev/ceph-xx-xx-xx/ceph-osd-db-this-is-the-old-lvm-for-ssd here
but the LV IS NOT SUPPOSED TO HAVE A PARTUUID.
Which is provable, first of all, by the fact that it isn't a partition; and secondly, by the fact that none of the other block.db LVs it created on the SSD in batch mode have a PARTUUID either!!
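To make that concrete, this is the kind of check I mean (the VG/LV names below are placeholders for the ones batch mode created):

for lv in /dev/ceph-xx-xx-xx/ceph-osd-db-*; do
    echo "$lv"
    blkid -s PARTUUID "$lv"    # prints nothing: an LV has no PARTUUID
done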
So kindly quit checking for something that isn't supposed to be there in the first place?!
(This is a bug all the way back in Nautilus, through latest, I believe.)
--
Philip Brown| Sr. Linux System Administrator | Medata, Inc.
5 Peters Canyon Rd Suite 250
Irvine CA 92606
Office 714.918.1310| Fax 714.918.1325
pbrown(a)medata.com| www.medata.com
Hi all,
I am running a 15.2.10 Ceph cluster on Ubuntu 18.04. I created an RBD image and mapped it on a host (Ubuntu 18.04, Ceph 15.2.10).
When I run mkfs.xfs -f /dev/rbd0 it hangs, but writing data with "rados -p {poolname} put obj myfile" works fine.
Has anyone encountered this kind of problem?
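For reference, the sequence was roughly this (pool and image names are placeholders for my real ones):

rbd create mypool/myimage --size 10G
rbd map mypool/myimage        # appears as /dev/rbd0
mkfs.xfs -f /dev/rbd0         # hangs here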
Dear Cephers,
My client has 10 volumes; each volume was assigned 8192 PGs, 81920 PGs in total. The cluster runs Luminous with BlueStore. During a power outage the cluster restarted, and we observed that OSD peering consumed a lot of CPU and memory, even leading to some OSD flapping.
My questions are thus: 1) how can we speed up OSD peering and avoid OSD flapping when there are a lot of PGs in the cluster, and 2) is there a practical limit on the number of PGs for a single cluster?
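To put rough numbers on it (assuming 3x replication purely for the arithmetic, and the commonly cited target of roughly 100 PGs per OSD): 81920 PGs x 3 = 245760 PG instances, which would correspond to something like 2400-2500 OSDs at that target; with fewer OSDs, each OSD has correspondingly more PGs to peer after a restart.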
best regards,
Samuel
huxiaoyu(a)horebdata.cn
Good morning.
I have a bucket with 50M objects in it. The bucket was created with
multisite sync, and it is now the master zone and the only zone.
After a health check, I saw weird objects stuck in a pending attr state.
I've tried to remove them with "radosgw-admin object rm --bypass-gc",
but I couldn't delete them. After sending the delete command I see the object
attributes changing to the pending state, but they stay like that.
I've also tried rados rm; the object data is deleted and no longer visible
with radosgw-admin object stat. After that I tried to copy the object with
rclone (source to dest), but I could not write the object to the same path!
Rclone gives an error, but the object is created anyway, and I see a 0-byte
object with rados and radosgw-admin again!
After that I tried to copy the problematic object from
/samebucket/objectpath/theObject --> /samebucket/theObject, and the 0-byte
object copied successfully and is not 0 bytes anymore! It looks OK!
Then, after all of this, I wonder: is it an index issue?
I've run radosgw-admin bi list and I saw 29 entries for the same object (I
guess all these retries left most of them behind). I also ran bi list
for a normal object and I see only 5 entries (the versions).
Here is the bi list output for the problematic object:
https://paste.ubuntu.com/p/jRtH3cMC94/
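For reference, these are the commands behind the attempts and listings above (bucket and object names are redacted placeholders):

radosgw-admin object rm --bucket=mybucket --object=path/to/theObject --bypass-gc
radosgw-admin object stat --bucket=mybucket --object=path/to/theObject
radosgw-admin bi list --bucket=mybucket --object=path/to/theObject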
I'm too scared to run a check on, or re-create, the bucket index.
Has anyone seen the same issue?