Hello,
My goal is to set up multisite RGW with two separate Ceph clusters in separate datacenters, with RGW data replicated between them. I created a lab for this purpose in both locations (latest Reef, installed with cephadm) and tried to follow this guide: https://docs.ceph.com/en/reef/radosgw/multisite/
Unfortunately, even after multiple attempts it always failed when creating the secondary zone. I could successfully pull the realm from the master, but that was pretty much the last truly successful step. I noticed that immediately after pulling the realm to the secondary, radosgw-admin user list returns an empty list (which IMHO should contain the replicated user list from the master). Continuing by setting the default realm and zonegroup and creating the secondary zone on the secondary cluster, I end up with two zones in each cluster, both seemingly in the same zonegroup, but with replication failing. This is what I see in sync status:
(master) [ceph: root@ceph-lab-brn-01 /]# radosgw-admin sync status
realm d2c4ebf9-e156-4c4e-9d56-3fff6a652e75 (ceph)
zonegroup abc3c0ae-a84d-48d4-8e78-da251eb78781 (cz)
zone 97fb5842-713a-4995-8966-5afe1384f17f (cz-brn)
current time 2023-08-30T12:58:12Z
zonegroup features enabled: resharding
disabled: compress-encrypted
metadata sync no sync (zone is master)
2023-08-30T12:58:13.991+0000 7f583a52c780 0 ERROR: failed to fetch datalog info
data sync source: 13a8c663-b241-4d8a-a424-8785fc539ec5 (cz-hol)
failed to retrieve sync info: (13) Permission denied
(secondary) [ceph: root@ceph-lab-hol-01 /]# radosgw-admin sync status
realm d2c4ebf9-e156-4c4e-9d56-3fff6a652e75 (ceph)
zonegroup abc3c0ae-a84d-48d4-8e78-da251eb78781 (cz)
zone 13a8c663-b241-4d8a-a424-8785fc539ec5 (cz-hol)
current time 2023-08-30T12:58:54Z
zonegroup features enabled: resharding
disabled: compress-encrypted
metadata sync failed to read sync status: (2) No such file or directory
2023-08-30T12:58:55.617+0000 7ff37c9db780 0 ERROR: failed to fetch datalog info
data sync source: 97fb5842-713a-4995-8966-5afe1384f17f (cz-brn)
failed to retrieve sync info: (13) Permission denied
On the master there is one user created during the process (synchronization-user); on the secondary there are no users, and when I try to re-create this synchronization user it complains that I shouldn't even try and should instead execute the command on the master. I can see the same realm and zonegroup IDs on both sides, but the zone lists differ:
(master) [ceph: root@ceph-lab-brn-01 /]# radosgw-admin zone list
{
"default_info": "97fb5842-713a-4995-8966-5afe1384f17f",
"zones": [
"cz-brn",
"default"
]
}
(secondary) [ceph: root@ceph-lab-hol-01 /]# radosgw-admin zone list
{
"default_info": "13a8c663-b241-4d8a-a424-8785fc539ec5",
"zones": [
"cz-hol",
"default"
]
}
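For completeness, these are roughly the commands I ran on each side, following the guide (the keys are placeholders; the port 80 endpoints are from my lab):

(master)
radosgw-admin user create --uid=synchronization-user \
    --display-name="Synchronization User" --system

(secondary)
radosgw-admin realm pull --url=http://ceph-lab-brn-01:80 \
    --access-key=<SYSTEM_ACCESS_KEY> --secret=<SYSTEM_SECRET_KEY>
radosgw-admin realm default --rgw-realm=ceph
radosgw-admin zonegroup default --rgw-zonegroup=cz
radosgw-admin zone create --rgw-zonegroup=cz --rgw-zone=cz-hol \
    --endpoints=http://ceph-lab-hol-01:80 \
    --access-key=<SYSTEM_ACCESS_KEY> --secret=<SYSTEM_SECRET_KEY>
radosgw-admin period update --commit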
The permission denied error is puzzling me - could it be because the realm pull didn't sync the users? I tried this multiple times with a clean Ceph install on both sides and always ended up the same. I even tried force-creating the same user with the same secrets on the other side, but it didn't help. How can I debug what kind of secret the secondary is trying to use when communicating with the master? Could it be that this multisite RGW setup is not yet truly supported in Reef? I noticed that the documentation itself seems written for older Ceph versions, as there is no mention of the orchestrator (for example in steps where RGW configuration files need to be edited, which is done differently when using cephadm).
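So far, this is how I have been trying to inspect which credentials the secondary presents (a sketch; zone and user names are from my lab, the debug switches are generic Ceph logging options):

radosgw-admin zone get --rgw-zone=cz-hol    # "system_key" should hold the sync user's keys
radosgw-admin period get                    # endpoints and zones in the current period
radosgw-admin sync status --debug-rgw=20 --debug-ms=1

and, on the master, to compare:

radosgw-admin user info --uid=synchronization-user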
I think the documentation is simply wrong at this time. Either it's missing some crucial steps, or it's outdated or otherwise unclear - simply by following all the steps as outlined there, you are likely to end up in the same state.
Thanks for any help!
Hello,
Finish v18.2.0 upgrade on LRC? It seems to be running v18.1.3
- Not much of a difference in code commits
Any news on teuthology jobs hanging?
- CephFS issues because of network troubles; resolved by Patrick
User council discussion follow-up
- Detailed info on this pad: https://pad.ceph.com/p/user_dev_relaunch
- First topic will come from David's team
16.2.14 release
- Pushing to release by this week
Regards,
Nizam
--
Nizamudeen A
Software Engineer
Red Hat <https://www.redhat.com/>
Hello,
Is there a way to fine-tune the rebalance even further than the basic tuning steps when adding new OSDs? Today I added some OSDs to the index pool and it generated many slow ops: OSD op latency and read operation latency increased, resulting in high PUT/GET latency.
https://ibb.co/album/9mN6GQ
osd_max_backfills, osd_recovery_max_active and osd_recovery_op_priority are all set to 1. Each NVMe drive hosts 4 OSDs, and each OSD has around 80 PGs.
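For reference, a sketch of those throttle settings in ceph config form (the mClock caveat is an assumption about Quincy+ defaults):

ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1
ceph config set osd osd_recovery_op_priority 1
# with the mClock scheduler (the default since Quincy) these limits may be
# ignored unless osd_mclock_override_recovery_settings is enabled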
The steps I follow when adding the OSDs (see the command sketch after the list):
1. Set norebalance
2. Add the OSDs
3. Wait for peering
4. Unset norebalance
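In command form, roughly (host and device are placeholders):

ceph osd set norebalance
ceph orch daemon add osd <host>:<device>
ceph -s        # watch until peering finishes
ceph osd unset norebalance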
It takes around 15-20 minutes to get back to normal, and I'd like to do this without the rebalance interrupting user traffic.
Thank you,
Istvan
Dear listers,
my employer already has a production Ceph cluster running, but we need a second one. I just wanted to ask your opinion on the following setup. It is planned for 500 TB net capacity, expandable to 2 PB. I expect the number of OSD servers to double in the next 4 years. Erasure code 3:2 will be used on the OSDs. Usage will be file storage, RADOS block devices and S3:
5x OSD servers (12x 18 TB Toshiba MG09SCA18TE SAS spinning disks for data, 2x 512 GB Samsung PM9A1 M.2 NVMe SSD 0.55 DWPD for system, 1x AMD 7313P CPU with 16 cores @ 3 GHz, 256 GB RAM, LSI SAS 9500 HBA, Broadcom P425G network adapter 4x 25 Gbit/s)

3x MON servers (1x 2 TB Samsung PM9A1 M.2 NVMe SSD 0.55 DWPD for system, 2x 1.6 TB Kioxia CD6-V SSD 3.0 DWPD for data, 2x Broadcom P210/N210 network 4x 10 Gbit/s, 1x AMD 7232P CPU with 8 cores @ 3.1 GHz, 64 GB RAM)

3x MDS servers (1x 2 TB Samsung PM9A1 M.2 NVMe SSD 0.55 DWPD for system, 2x 1.6 TB Kioxia CD6-V SSD 3.0 DWPD for data, 2x Broadcom P210/N210 network 4x 10 Gbit/s, 1x AMD 7313P CPU with 16 cores @ 3 GHz, 128 GB RAM)
OSD servers will be connected on the "backend" via 2x 25 Gbit fibre interfaces to 2x Mikrotik CRS518-16XS-2XQ (which are interconnected via 100 Gbit for high availability). For the "frontend" connection to servers/clients via 2x 10 Gbit we're looking into 3x Mikrotik CRS326-24S+2Q+RM (which are interconnected via 40 Gbit for high availability).
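For context, the planned 3:2 erasure coding would look roughly like this (a sketch, assuming k=3/m=2 and host as the failure domain; profile and pool names are made up):

ceph osd erasure-code-profile set ec-3-2 k=3 m=2 crush-failure-domain=host
ceph osd pool create ecpool erasure ec-3-2
# note: k+m=5 matches the 5 OSD hosts exactly, so there is no spare host
# to recover onto if one fails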
Especially for the "frontend" switches i'm looking for alternatives.
Currently we use Huawei C6810-32T16A4Q-LI models with 2x33 LACP
connections connected via 10 GBit/s RJ45. But those had ports blocking
after a number of errors which resulted in some trouble. We'd like to
avoid IOS and clones in general and would prefer a decent web interface.
Any comments/recommendations?
Best regards,
Kai
We upgraded from Quincy to Reef and all went smoothly (thanks, Ceph developers!)
When adding OSDs, the process seems to have changed: the docs no longer mention the OSD spec, and when giving it a try it fails when it bumps into the root drive (which has an active LVM). I expect I can add a filter to avoid that. But is the OSD spec (https://docs.ceph.com/en/octopus/cephadm/drivegroups/) approach now deprecated? Is the web interface now the preferred way?
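For reference, the kind of spec I mean - a minimal sketch with a filter I'd expect to skip the NVMe root drive (the service_id and the rotational filter are just examples):

service_type: osd
service_id: spinning_osds
placement:
  host_pattern: '*'
data_devices:
  rotational: 1    # only spinning disks, so the NVMe root drive is skipped

applied with "ceph orch apply -i osd_spec.yml".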
thanks.