Hi all,
I am trying to set up an active-active NFS Ganesha cluster (with two Ganeshas (v3.0) running in Docker containers). I managed to get two Ganesha daemons running using the rados_cluster backend for active-active deployment. I have the grace db within the cephfs metadata pool, in its own namespace, which keeps track of the node status.
Now I can mount the exposed filesystem over NFS (v4.1, v4.2) through both daemons. So far so good.
Testing high availability resulted in unexpected behavior, and I am not sure whether it is intentional or a configuration problem.
Problem:
If both are running, no E or N flags are set within the grace db, as I expect. Once one host goes down (or is taken down), ALL clients can neither read nor write to the mounted filesystem, even the clients that are not connected to the dead Ganesha. In the db I see that the dead Ganesha has state NE and the active one has E. This state is what I expect from the Ganesha documentation. Nevertheless, I would assume that the clients connected to the active daemon would not be blocked. This state is not cleaned up by itself (e.g. after the grace period).
I can unblock this situation by 'lifting' the dead node with a direct db call (using the ganesha-rados-grace tool), but within an active-active deployment this is not suitable.
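For reference, the manual 'lift' I run looks roughly like this (pool, namespace and user as in the RADOS_KV section below; DEAD_NODEID stands for the nodeid of the failed daemon):
ganesha-rados-grace --cephconf /etc/ceph/ceph.conf --userid ganesha --pool cephfsmetadata --ns grace dump
ganesha-rados-grace --cephconf /etc/ceph/ceph.conf --userid ganesha --pool cephfsmetadata --ns grace lift DEAD_NODEID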
The ganesha config looks like:
------------
NFS_CORE_PARAM
{
Enable_NLM = false;
Protocols = 4;
}
NFSv4
{
RecoveryBackend = rados_cluster;
Minor_Versions = 1,2;
}
RADOS_KV
{
pool = "cephfsmetadata";
nodeid = "a" ;
namespace = "grace";
UserId = "ganesha";
Ceph_Conf = "/etc/ceph/ceph.conf";
}
MDCACHE {
Dir_Chunk = 0;
NParts = 1;
Cache_Size = 1;
}
EXPORT
{
Export_ID=101;
Protocols = 4;
Transports = TCP;
Path = PATH;
Pseudo = PSEUDO_PATH;
Access_Type = RW;
Attr_Expiration_Time = 0;
Squash = no_root_squash;
FSAL {
Name = CEPH;
User_Id = "ganesha";
Secret_Access_Key = CEPHXKEY;
}
}
LOG {
Default_Log_Level = "FULL_DEBUG";
}
------------
Does anyone have similar problems? Or, if this behavior is intentional, can you explain to me why this is the case?
Thank you in advance for your time and thoughts.
Kind regards,
Michael
Hello everyone,
A few weeks ago I enabled the ceph balancer on my cluster as per the instructions here: https://docs.ceph.com/docs/mimic/mgr/balancer/
I am running ceph version:
ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
The cluster has 48 osds (40 osds in hdd pools and 8 osds in ssd pool)
Currently, the balancer status is showing as Active.
# ceph balancer status
{
"active": true,
"plans": [],
"mode": "upmap"
}
The health status of the cluster is:
health: HEALTH_OK
Previously, I've used the old REWEIGHT to change the placement of data, as I've seen very uneven usage (ranging from about 60% usage on some OSDs to over 90% on others). So I have a number of OSDs with a reweight of 1 and some going down to 0.75.
At the moment the OSD usage ranges between about 65% and just under 90%, so still a huge variation. After switching on the balancer, I've not actually seen any activity or data migration, so I am not sure if the balancer is working at all. Could someone tell me how to check whether balancing is doing its job?
The second question: now that the balancer is switched on, am I supposed to set the reweight values back to their default of 1?
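For reference, the commands I had in mind for checking this (the OSD id in the last line is just an example):
ceph balancer eval        # score of the current data distribution (lower is better)
ceph osd df tree          # per-OSD utilisation, to watch the variation shrink
ceph osd reweight 12 1.0  # example of resetting an old reweight back to 1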
Many thanks
Dear all,
we are in the process of migrating a ceph file system from a 2-pool layout (rep meta + ec data) to the recently recommended 3-pool layout (rep meta, rep primary data, ec data). As part of this, we need to migrate any ceph xattrs set on files and directories. As these are no longer discoverable, how would one go about this?
Special cases:
How to migrate quota settings?
How to migrate dir- and file-layouts?
Ideally, at least quota attributes should be transferable on the fly with tools like rsync.
If automatic migration is not possible, is there at least an efficient way to *find* everything with special ceph attributes?
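The brute-force fallback I can think of is a scan along these lines (/mnt/oldfs is a placeholder for the old mount point; attribute names are from the CephFS docs, and depending on the client, unset vxattrs may return an error or an empty value, so treat the output accordingly):
cd /mnt/oldfs
find . -type d -print0 | while IFS= read -r -d '' d; do
    for attr in ceph.quota.max_bytes ceph.quota.max_files ceph.dir.layout; do
        v=$(getfattr --only-values -n "$attr" "$d" 2>/dev/null) && echo "$d|$attr|$v"
    done
done > ceph-dir-xattrs.txt
Quotas could then be re-applied on the new fs with e.g. 'setfattr -n ceph.quota.max_bytes -v 10737418240 <dir>'; layouts would need the individual fields (ceph.dir.layout.pool etc.) set per directory, and ceph.file.layout handled similarly for files.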
Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hi,
I upgraded Ceph from 14.2.7 to the new version 14.2.8, and bucket notifications do not work.
I can't create a topic:
I use Postman to send a POST following https://docs.ceph.com/docs/master/radosgw/notifications/#create-a-topic
REQUEST:
POST
http://rgw1:7480/?Action=CreateTopic&Name=webno&push-endpoint=https://192.1…
RESPONSE:
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>MethodNotAllowed</Code>
<RequestId>tx000000000000000000008-005e6a0eab-cbcad-bj</RequestId>
<HostId>cbcad-bj-bjz</HostId>
</Error>
The debug info on the node below:
2020-03-12 18:49:24.684 7fdde1e1d700 1 ====== starting new request req=0x55c91a51e8f0 =====
2020-03-12 18:49:24.684 7fdde1e1d700 2 req 14 0.000s initializing for trans_id = tx00000000000000000000e-005e6a13b4-cbc6a-bj
2020-03-12 18:49:24.684 7fdde1e1d700 10 rgw api priority: s3=8 s3website=7
2020-03-12 18:49:24.684 7fdde1e1d700 10 host=192.168.3.250
2020-03-12 18:49:24.684 7fdde1e1d700 20 subdomain= domain= in_hosted_domain=0 in_hosted_domain_s3website=0
2020-03-12 18:49:24.684 7fdde1e1d700 20 final domain/bucket subdomain= domain= in_hosted_domain=0 in_hosted_domain_s3website=0 s->info.domain= s->info.request_uri=/
2020-03-12 18:49:24.684 7fdde1e1d700 10 meta>> HTTP_X_AMZ_CONTENT_SHA256
2020-03-12 18:49:24.684 7fdde1e1d700 10 meta>> HTTP_X_AMZ_DATE
2020-03-12 18:49:24.684 7fdde1e1d700 10 x>> x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
2020-03-12 18:49:24.684 7fdde1e1d700 10 x>> x-amz-date:20200312T104924Z
2020-03-12 18:49:24.684 7fdde1e1d700 20 get_handler handler=26RGWHandler_REST_Service_S3
2020-03-12 18:49:24.684 7fdde1e1d700 10 handler=26RGWHandler_REST_Service_S3
2020-03-12 18:49:24.684 7fdde1e1d700 2 req 14 0.000s getting op 4
2020-03-12 18:49:24.684 7fdde1e1d700 20 handler->ERRORHANDLER: err_no=-2003 new_err_no=-2003
2020-03-12 18:49:24.684 7fdde1e1d700 2 req 14 0.000s http status=405
2020-03-12 18:49:24.684 7fdde1e1d700 1 ====== req done req=0x55c91a51e8f0 op status=0 http_status=405 latency=0s ======
2020-03-12 18:49:25.502 7fde0fe79700 2 RGWDataChangesLog::ChangesRenewThread: start
The same POST works in version 14.2.7.
What is the correct way to create a topic in version 14.2.8?
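For comparison, one could also let an SNS-capable client build the signed request instead of Postman; a rough sketch with the AWS CLI (the push-endpoint value is a placeholder, the CLI needs the RGW user's credentials configured, and I'm not sure whether 14.2.8 parses the attributes this way):
aws --endpoint-url http://rgw1:7480 sns create-topic --name webno --attributes '{"push-endpoint": "https://PUSH_ENDPOINT"}'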
Upgraded to 14.2.7, doesn't appear to have affected the behavior. As requested:
~$ ceph tell mds.mds1 heap stats
2020-02-10 16:52:44.313 7fbda2cae700 0 client.59208005
ms_handle_reset on v2:x.x.x.x:6800/3372494505
2020-02-10 16:52:44.337 7fbda3cb0700 0 client.59249562
ms_handle_reset on v2:x.x.x.x:6800/3372494505
mds.mds1 tcmalloc heap stats:------------------------------------------------
MALLOC: 50000388656 (47684.1 MiB) Bytes in use by application
MALLOC: + 0 ( 0.0 MiB) Bytes in page heap freelist
MALLOC: + 174879528 ( 166.8 MiB) Bytes in central cache freelist
MALLOC: + 14511680 ( 13.8 MiB) Bytes in transfer cache freelist
MALLOC: + 14089320 ( 13.4 MiB) Bytes in thread cache freelists
MALLOC: + 90534048 ( 86.3 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 50294403232 (47964.5 MiB) Actual memory used (physical + swap)
MALLOC: + 50987008 ( 48.6 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 50345390240 (48013.1 MiB) Virtual address space used
MALLOC:
MALLOC: 260018 Spans in use
MALLOC: 20 Thread heaps in use
MALLOC: 8192 Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
~$ ceph tell mds.mds1 heap release
2020-02-10 16:52:47.205 7f037eff5700 0 client.59249625
ms_handle_reset on v2:x.x.x.x:6800/3372494505
2020-02-10 16:52:47.237 7f037fff7700 0 client.59249634
ms_handle_reset on v2:x.x.x.x:6800/3372494505
mds.mds1 releasing free RAM back to system.
The pools over 15 minutes or so:
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 2045,
"bytes": 3069493686
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 2445,
"bytes": 3111162538
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 7850,
"bytes": 7658678767
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 12274,
"bytes": 11436728978
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 13747,
"bytes": 11539478519
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 14615,
"bytes": 13859676992
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 23267,
"bytes": 22290063830
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 44944,
"bytes": 40726959425
}
And one about a minute after the heap release showing continued growth:
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 50694,
"bytes": 47343942094
}
This is on a single active MDS with 2 standbys; the workload scans about a
million files with about 20 parallel threads on two clients, opening and
reading each file if it exists.
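For completeness, the samples above can be collected with a simple loop like:
while true; do
    ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
    sleep 60
done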
On Wed, Jan 22, 2020 at 8:25 AM John Madden <jmadden.com(a)gmail.com> wrote:
>
> > Couldn't John confirm that this is the issue by checking the heap stats and triggering the release via
> >
> > ceph tell mds.mds1 heap stats
> > ceph tell mds.mds1 heap release
> >
> > (this would be much less disruptive than restarting the MDS)
>
> That was my first thought as well, but `release` doesn't appear to do
> anything in this case.
>
> John
(resending to the new mailing list)
Dear Casey, Dear All,
We tested the migration from Luminous to Nautilus and noticed two regressions breaking the RGW integration in OpenStack:
1) the following config parameter is not working on Nautilus but is valid on Luminous and on Master:
rgw_keystone_implicit_tenants = swift
In the log: parse error setting 'rgw_keystone_implicit_tenants' to 'swift' (Expected option value to be integer, got 'swift')
This parameter is important to make RGW work for both S3 and Swift.
Setting it to false breaks Swift/OpenStack, and setting it to true makes S3 incompatible with DNS-style bucket names (with shared or public access).
Please note that path-style bucket names are deprecated by AWS and most clients only support DNS-style...
Ref.:
https://tracker.ceph.com/issues/24348
https://github.com/ceph/ceph/commit/3ba7be8d1ac7ee43e69eebb58263cd080cca1d38
2) the server-side encryption (SSE-KMS) is broken on Nautilus:
to reproduce the issue:
s3cmd --access_key $ACCESSKEY --secret_key $SECRETKEY --host-bucket "%(bucket)s.$ENDPOINT" --host "$ENDPOINT" --region="$REGION" --signature-v2 --no-preserve --no-ssl --server-side-encryption --server-side-encryption-kms-id ${SECRET##*/} put helloenc.txt s3://testenc/
output:
upload: 'helloenc.txt' -> 's3://testenc/helloenc.txt' [1 of 1]
9 of 9 100% in 0s 37.50 B/s done
ERROR: S3 error: 403 (AccessDenied): Failed to retrieve the actual key, kms-keyid: cd0903db-c613-49be-96d9-165c02544bc7
rgw log: see below
TL;DR: after investigating, I found that radosgw was actually getting the barbican secret correctly, but the HTTP code (=200) validation was failing because of a bug in Nautilus.
My understanding is the following (please correct me):
The bug is in src/rgw/rgw_http_client.cc.
Since Nautilus, HTTP codes are converted into error codes (200 becomes 0) during request processing.
This happens in RGWHTTPManager::reqs_thread_entry(), which centralizes the processing of (curl) HTTP requests with multi-threading.
This is fine, but the member variable http_status of the class RGWHTTPClient is not updated with the resulting HTTP code, so the variable keeps its initial value of 0.
Then, in src/rgw/rgw_crypt.cc, the logic still verifies that http_status is in the range [200,299], and this fails...
I wrote the following one-liner bugfix for src/rgw/rgw_http_client.cc:
diff --git a/src/rgw/rgw_http_client.cc b/src/rgw/rgw_http_client.cc
index d0f0baead6..7c115293ad 100644
--- a/src/rgw/rgw_http_client.cc
+++ b/src/rgw/rgw_http_client.cc
@@ -1146,6 +1146,7 @@ void *RGWHTTPManager::reqs_thread_entry()
status = -EAGAIN;
}
int id = req_data->id;
+ req_data->client->http_status = http_status;
finish_request(req_data, status);
switch (result) {
case CURLE_OK:
s3cmd then works fine with KMS server-side encryption.
Questions:
* Could someone please write a fix for the regression in 1) and make a PR?
* Could somebody also make a PR for 2)?
Thank you for your help. :-)
Cheers
Francois Scheurer
rgw log:
export CLUSTER=ceph; /home/local/ceph/build/bin/radosgw -f --cluster ${CLUSTER} --name client.rgw.$(hostname) --setuser ceph --setgroup ceph &
tail -fn0 /var/log/ceph/ceph-client.rgw.ewos1-osd1-stage.log | less -IS
2020-02-26 16:32:59.208 7fc1f1c54700 20 Getting KMS encryption key for key=cd0903db-c613-49be-96d9-165c02544bc7
2020-02-26 16:32:59.208 7fc1f1c54700 20 Requesting secret from barbican url=http://keystone.service.stage.i.ewcs.ch:5000/v3/auth/tokens
2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug: RGWHTTPClient::process: http_status: 0
2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug: RGWHTTP::process
2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug: RGWHTTP::send
2020-02-26 16:32:59.208 7fc1f1c54700 20 sending request to http://keystone.service.stage.i.ewcs.ch:5000/v3/auth/tokens
2020-02-26 16:32:59.208 7fc1f1c54700 20 ssl verification is set to off
2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug: RGWHTTPManager::add_request: client->init_request(req_data): 0
2020-02-26 16:32:59.208 7fc1f1c54700 20 register_request mgr=0x56374b865540 req_data->id=4, curl_handle=0x56374c77c4a0
2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug: RGWHTTPManager::signal_thread(): write(thread_pipe[1], (void *)&buf, sizeof(buf)): 4
2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug: RGWHTTPManager::add_request: signal_thread(): 0
2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug: RGWHTTP::send: rgw_http_manager->add_request(req): 0
2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug: RGWHTTP::process: send(req): 0
2020-02-26 16:32:59.208 7fc1f1c54700 20 ewdebug: struct rgw_http_req_data : public RefCountedObject : int wait() : ret: 0
2020-02-26 16:32:59.208 7fc2184a1700 20 link_request req_data=0x56374c96a240 req_data->id=4, curl_handle=0x56374c77c4a0
2020-02-26 16:32:59.608 7fc2184a1700 20 ewdebug: RGWHTTPManager::reqs_thread_entry: http_status: 201
2020-02-26 16:32:59.608 7fc2184a1700 20 ewdebug: RGWHTTPManager::reqs_thread_entry: rgw_http_error_to_errno(http_status): 0
2020-02-26 16:32:59.608 7fc2184a1700 20 ewdebug: RGWHTTPManager::reqs_thread_entry: finish_request(req_data, status): status: 0
2020-02-26 16:32:59.608 7fc2184a1700 20 ewdebug: struct rgw_http_req_data : public RefCountedObject : void finish(int r) : ret: 0
2020-02-26 16:32:59.652 7fc1f1c54700 5 ewdebug: request_key_from_barbican: Accept application/octet-stream X-Auth-Token gAAAAABeVo-xxx
2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug: RGWHTTPClient::process: http_status: 0
2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug: RGWHTTP::process
2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug: RGWHTTP::send
2020-02-26 16:32:59.652 7fc1f1c54700 20 sending request to http://barbican.service.stage.i.ewcs.ch:9311/v1/secrets/cd0903db-c613-49be-…
2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug: RGWHTTPManager::add_request: client->init_request(req_data): 0
2020-02-26 16:32:59.652 7fc1f1c54700 20 register_request mgr=0x56374b865540 req_data->id=5, curl_handle=0x56374c77c4a0
2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug: RGWHTTPManager::signal_thread(): write(thread_pipe[1], (void *)&buf, sizeof(buf)): 4
2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug: RGWHTTPManager::add_request: signal_thread(): 0
2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug: RGWHTTP::send: rgw_http_manager->add_request(req): 0
2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug: RGWHTTP::process: send(req): 0
2020-02-26 16:32:59.652 7fc1f1c54700 20 ewdebug: struct rgw_http_req_data : public RefCountedObject : int wait() : ret: 0
2020-02-26 16:32:59.652 7fc2184a1700 20 link_request req_data=0x56374c96a240 req_data->id=5, curl_handle=0x56374c77c4a0
=> 2020-02-26 16:32:59.752 7fc2184a1700 20 ewdebug: RGWHTTPManager::reqs_thread_entry: http_status: 200
2020-02-26 16:32:59.752 7fc2184a1700 20 ewdebug: RGWHTTPManager::reqs_thread_entry: rgw_http_error_to_errno(http_status): 0
2020-02-26 16:32:59.752 7fc2184a1700 20 ewdebug: RGWHTTPManager::reqs_thread_entry: finish_request(req_data, status): status: 0
2020-02-26 16:32:59.752 7fc2184a1700 20 ewdebug: struct rgw_http_req_data : public RefCountedObject : void finish(int r) : ret: 0
2020-02-26 16:32:59.752 7fc1f1c54700 5 ewdebug: request_key_from_barbican: secret_req.process: 0
=> 2020-02-26 16:32:59.752 7fc1f1c54700 5 ewdebug: request_key_from_barbican: secret_req.get_http_status: 0
2020-02-26 16:32:59.752 7fc1f1c54700 5 ewdebug: request_key_from_barbican: secret_req.get_http_status not in [200,299] range!
2020-02-26 16:32:59.752 7fc1f1c54700 5 Failed to retrieve secret from barbican:cd0903db-c613-49be-96d9-165c02544bc7
2020-02-26 16:32:59.752 7fc1f1c54700 5 ERROR: failed to retrieve actual key from key_id: cd0903db-c613-49be-96d9-165c02544bc7
2020-02-26 16:32:59.752 7fc1f1c54700 2 req 1 1.092s s3:put_obj completing
2020-02-26 16:32:59.752 7fc1f1c54700 2 req 1 1.092s s3:put_obj op status=-13
2020-02-26 16:32:59.752 7fc1f1c54700 2 req 1 1.092s s3:put_obj http status=403
2020-02-26 16:32:59.752 7fc1f1c54700 1 ====== req done req=0x56374c9808d0 op status=-13 http_status=403 latency=1.092s ======
=> we see that http_status is correct (200), but the value returned by secret_req.get_http_status (reading the member of class RGWHTTPClient) is incorrect (0 instead of 200)
Hi Folks
We are using Ceph as the storage backend on our 6-node Proxmox VM cluster. To monitor our systems we use Zabbix, and I would like to get some Ceph data into Zabbix so we get alarms when something goes wrong.
Ceph mgr has a module, "zabbix", that uses zabbix_sender to actively send data, but I cannot get the module working. It always responds with "failed to send data".
The network side seems to be fine:
root@vm-2:~# traceroute 192.168.15.253
traceroute to 192.168.15.253 (192.168.15.253), 30 hops max, 60 byte packets
1 192.168.15.253 (192.168.15.253) 0.411 ms 0.402 ms 0.393 ms
root@vm-2:~# nmap -p 10051 192.168.15.253
Starting Nmap 7.70 ( https://nmap.org ) at 2019-09-18 08:40 CEST
Nmap scan report for 192.168.15.253
Host is up (0.00026s latency).
PORT STATE SERVICE
10051/tcp open zabbix-trapper
MAC Address: BA:F5:30:EF:40:EF (Unknown)
Nmap done: 1 IP address (1 host up) scanned in 0.61 seconds
root@vm-2:~# ceph zabbix config-show
{"zabbix_port": 10051, "zabbix_host": "192.168.15.253", "identifier": "VM-2", "zabbix_sender": "/usr/bin/zabbix_sender", "interval": 60}
root@vm-2:~#
But if I try "ceph zabbix send" I get "failed to send data to zabbix", and this shows up in the system journal:
Sep 18 08:41:13 vm-2 ceph-mgr[54445]: 2019-09-18 08:41:13.272 7fe360fe4700 -1 mgr.server reply reply (1) Operation not permitted
The log of ceph-mgr on that machine states:
2019-09-18 08:42:18.188 7fe359fd6700 0 mgr[zabbix] Exception when sending: /usr/bin/zabbix_sender exited non-zero: zabbix_sender [3253392]: DEBUG: answer [{"response":"success","info":"processed: 0; failed: 44; total: 44; seconds spent: 0.000179"}]
2019-09-18 08:43:18.217 7fe359fd6700 0 mgr[zabbix] Exception when sending: /usr/bin/zabbix_sender exited non-zero: zabbix_sender [3253629]: DEBUG: answer [{"response":"success","info":"processed: 0; failed: 44; total: 44; seconds spent: 0.000321"}]
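The answer in the log suggests the data reaches the Zabbix server but all 44 items are rejected, so I suppose the next step is to call zabbix_sender by hand, something like this (the item key is only an illustrative guess; I don't know exactly which keys the module sends):
zabbix_sender -vv -z 192.168.15.253 -p 10051 -s "VM-2" -k ceph.overall_status -o HEALTH_OK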
I'm guessing this could have something to do with user rights, but I have no idea where to start tracking this down.
Maybe someone here has a hint?
If more information is needed, I will gladly provide it.
greetings
Ingo
Hi cephers,
I'm looking for some advice on what to do about drives of different
sizes in the same cluster.
We have so far kept the drive sizes consistent on our main ceph cluster
(using 8TB drives). We're getting some new hardware with larger, 12TB
drives next, and I'm pondering how best to configure them. If I simply
add them, they will have 1.5x the data (which is less of a
problem), but will also get 1.5x the iops - so I presume it will slow
the whole cluster down as a result (these drives will be busy, and the
rest will not be as much). I'm wondering how people generally handle this.
I'm more concerned about these larger drives being busier than the rest,
so I'd like to be able to put, for example, a third of a drive's worth of
less-accessed data on them in addition to the usual data, to use the extra
capacity but not increase the load on them. Is there an easy way to
accomplish this? One possibility is to run two OSDs on the drive (in two
crush hierarchies), which isn't ideal. Can I just run one OSD somehow and
put it into two crush roots, or something similar?
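For illustration, the per-OSD knobs I'm aware of so far are the crush weight and the primary affinity (OSD id and values below are only examples):
ceph osd crush reweight osd.48 8.0      # advertise less capacity so the 12TB drive takes less data
ceph osd primary-affinity osd.48 0.66   # shift primary (read) load away from it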
Andras
Hi there
I have a fairly simple ceph multisite configuration with 2 ceph clusters
in 2 different datacenters in the same city
The rgws have this config for ssl:
rgw_frontends = civetweb port=7480+443s
ssl_certificate=/opt/ssl/ceph-bundle.pem
The certificate is a real issued certificate, not self signed
I configured the multisite with the guide from
https://docs.ceph.com/docs/nautilus/radosgw/multisite/
More or less ok so far, some learning curve but that's ok
I can access and upload to buckets at both endpoints with an s3 client
using https - https://ceph01cs1.domain.com and
https://ceph01cs2.domain.com - all good
Now the problem seems to be when my zones in the zonegroup use https
endpoints, e.g.
{
    "id": "4c6774fb-01eb-41fe-a74a-c2693f8e69fc",
    "name": "eu",
    "api_name": "eu",
    "is_master": "true",
    "endpoints": [
        "https://ceph01cs1.domain.com:443"
    ],
    "hostnames": [],
    "hostnames_s3website": [],
    "master_zone": "0c203df2-6f31-4ad1-a899-91f85bf34c4e",
    "zones": [
        {
            "id": "0c203df2-6f31-4ad1-a899-91f85bf34c4e",
            "name": "ceph01cs1",
            "endpoints": [
                "https://ceph01cs1.domain.com:443"
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "",
            "sync_from_all": "true",
            "sync_from": [],
            "redirect_zone": ""
        },
        {
            "id": "fec1fec8-a3c1-454d-8ed2-2c1da45f9c33",
            "name": "ceph01cs2",
            "endpoints": [
                "https://ceph01cs2.domain.com:443"
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "",
            "sync_from_all": "true",
            "sync_from": [],
            "redirect_zone": ""
        }
    ],
    "placement_targets": [
        {
            "name": "default-placement",
            "tags": [],
            "storage_classes": [
                "STANDARD"
            ]
        }
    ],
    "default_placement": "default-placement",
    "realm_id": "08921dd5-1523-41b6-908f-2f58aa38c969"
}
Metadata syncs OK - buckets and users get created, but data doesn't - and the period can be committed and appears on both clusters.
I can also curl between the two clusters over 443
However, data sync gets stuck on 'init':
          realm 08921dd5-1523-41b6-908f-2f58aa38c969 (world)
      zonegroup 4c6774fb-01eb-41fe-a74a-c2693f8e69fc (eu)
           zone 0c203df2-6f31-4ad1-a899-91f85bf34c4e (ceph01cs2)
  metadata sync no sync (zone is master)
      data sync source: fec1fec8-a3c1-454d-8ed2-2c1da45f9c33 (ceph01cs1)
                        init
                        full sync: 128/128 shards
                        full sync: 0 buckets to sync
                        incremental sync: 0/128 shards
                        data is behind on 128 shards
                        behind shards:
                        [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127]
I find errors like:
2020-03-31 20:27:11.372 7f60c84e1700 0 RGW-SYNC:data:sync: ERROR:
failed to init sync, retcode=-16
2020-03-31 20:27:29.548 7f60c84e1700 0
RGW-SYNC:data:sync:init_data_sync_status: ERROR: failed to read remote
data log shards
2020-03-31 20:29:48.499 7f60c94e3700 0 RGW-SYNC:meta: ERROR: failed to
fetch all metadata keys
If I change the endpoints in the zonegroup to plain http, e.g.
http://ceph01cs1.domain.com:7480 and http://ceph01cs2.domain.com:7480
then sync starts!
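For reference, I switch the endpoints roughly like this (and back again the same way):
radosgw-admin zonegroup get > zonegroup.json
# edit the "endpoints" entries, e.g. https://ceph01cs1.domain.com:443 -> http://ceph01cs1.domain.com:7480
radosgw-admin zonegroup set --infile zonegroup.json
radosgw-admin zone modify --rgw-zone=ceph01cs1 --endpoints=http://ceph01cs1.domain.com:7480
radosgw-admin period update --commit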
So my question (and I couldn't find any examples of people using https to
sync): are https endpoints supported with multisite? And why would meta
sync work over https but not data?
Many thanks
Richard
This is the eighth update to the Ceph Nautilus release series. This release
fixes issues across a range of subsystems. We recommend that all users upgrade
to this release. Please note the following important changes in this
release; as always the full changelog is posted at:
https://ceph.io/releases/v14-2-8-nautilus-released
Notable Changes
---------------
* The default value of `bluestore_min_alloc_size_ssd` has been changed
to 4K to improve performance across all workloads.
* The following OSD memory config options related to bluestore cache autotuning can now
be configured during runtime:
- osd_memory_base (default: 768 MB)
- osd_memory_cache_min (default: 128 MB)
- osd_memory_expected_fragmentation (default: 0.15)
- osd_memory_target (default: 4 GB)
The above options can be set with::
ceph config set osd <option> <value>
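For example, to give every OSD a 6 GiB memory target at runtime (value in bytes)::
ceph config set osd osd_memory_target 6442450944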
* The MGR now accepts `profile rbd` and `profile rbd-read-only` user caps.
These caps can be used to provide users access to MGR-based RBD functionality
such as `rbd perf image iostat` and `rbd perf image iotop`.
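For example, a read-only monitoring user could be created along these lines (the client name and pool are placeholders)::
ceph auth get-or-create client.rbd-metrics mon 'profile rbd' mgr 'profile rbd-read-only' osd 'profile rbd-read-only pool=rbd'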
* The configuration value `osd_calc_pg_upmaps_max_stddev` used for upmap
balancing has been removed. Instead use the mgr balancer config
`upmap_max_deviation` which now is an integer number of PGs of deviation
from the target PGs per OSD. This can be set with a command like
`ceph config set mgr mgr/balancer/upmap_max_deviation 2`. The default
`upmap_max_deviation` is 1. There are situations where crush rules
would not allow a pool to ever have completely balanced PGs. For example, if
crush requires 1 replica on each of 3 racks, but there are fewer OSDs in 1 of
the racks. In those cases, the configuration value can be increased.
* RGW: a mismatch between the bucket notification documentation and the actual
message format was fixed. This means that any endpoints receiving bucket
notification, will now receive the same notifications inside a JSON array
named 'Records'. Note that this does not affect pulling bucket notification
from a subscription in a 'pubsub' zone, as these are already wrapped inside
that array.
* CephFS: multiple active MDS forward scrub is now rejected. Scrub currently
only is permitted on a file system with a single rank. Reduce the ranks to one
via `ceph fs set <fs_name> max_mds 1`.
* Ceph now refuses to create a file system with a default EC data pool. For
further explanation, see:
https://docs.ceph.com/docs/nautilus/cephfs/createfs/#creating-pools
* Ceph will now issue a health warning if a RADOS pool has a `pg_num`
value that is not a power of two. This can be fixed by adjusting
the pool to a nearby power of two::
ceph osd pool set <pool-name> pg_num <new-pg-num>
Alternatively, the warning can be silenced with::
ceph config set global mon_warn_on_pool_pg_num_not_power_of_two false
Getting Ceph
------------
* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-14.2.8.tar.gz
* For packages, see http://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 2d095e947a02261ce61424021bb43bd3022d35cb
--
Abhishek Lekshmanan
SUSE Software Solutions Germany GmbH
GF: Felix Imendörffer HRB 21284 (AG Nürnberg)