Dear All


We have the same question here; if anyone can help, thank you!

We did not find any documentation about the steps to reset and restart the sync,
especially about the implications of 'bilog trim', 'mdlog trim' and 'datalog trim'.

Our secondary zone is read-only. Both the master and the secondary zone are on Nautilus (master 14.2.9, secondary 14.2.12).

Can someone also clarify the following points? Many thanks in advance!

1) Is it safe to use these 3 commands (bilog trim, mdlog trim, datalog trim) on the master?
  Are the bucket index logs (bilogs) used exclusively for the sync, or are they needed even without multi-site? (mdlog/datalog are obviously multi-site only.)
  A rough sketch of the commands we have in mind follows after question 4.

2) Can we run these 3 commands while the sync is running, or do we first need to stop all rgw instances on the secondary zone?
  In the latter case, do we need to stop client traffic and wait for the metadata/data sync to catch up before stopping the secondary zone instances?

3) Can we then restart the instances on the secondary zone and expect the rgw sync to resume correctly?
  Or do we first need to run 'metadata sync init' and 'data sync init' on the secondary zone (to trigger a full sync)?
  Or is it even necessary to delete all rgw pools on the secondary zone?

4) Regarding the full sync, does it verify the full object data, or only the object size and mtime?
  If we update the secondary zone to Nautilus 14.2.18 and enable rgw_sync_obj_etag_verify,
  will a full sync also detect ETag mismatches on objects that are already present on the secondary zone?
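
For reference, here is a rough sketch of what we have in mind (the zone and bucket names are only placeholders for our setup, so please correct us if the approach itself is wrong).

On the master, trimming the bucket index log of one bucket:
# radosgw-admin bilog trim --bucket=<bucket-name>

On the secondary zone, re-initialising a full sync and then watching it catch up (followed, if needed, by a restart of the secondary rgw daemons):
# radosgw-admin metadata sync init
# radosgw-admin data sync init --source-zone=<master-zone-name>
# radosgw-admin sync status

And for question 4, the option we would set for the secondary zone's rgw daemons in ceph.conf (the section name depends on how the daemons are named in our setup):
[client.rgw.<instance>]
rgw_sync_obj_etag_verify = true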


Cheers
Francois




From: ceph-users on behalf of Osiński Piotr <Piotr.Osinski@grupawp.pl>
Sent: Saturday, June 22, 2019 11:44 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] How to reset and configure replication on multiple RGW servers from scratch?
 

Hi,

For testing purposes, I configured RGW multisite synchronization between two Ceph Mimic 13.2.6 clusters (I also tried 13.2.5).
Now I want to reset all current settings and configure replication from scratch.

Data (pools, buckets) on the master zone will not be deleted.

What has been done:
1) Deleted the secondary zone
# radosgw-admin zone delete --rgw-zone=dc2_zone

2) Removed the secondary zone from zonegroup
# radosgw-admin zonegroup remove --rgw-zonegroup=master_zonegroup --rgw-zone=dc2_zone

3) Committed the changes
# radosgw-admin period update --commit

4) Trimmed all datalogs on the master zone
# radosgw-admin datalog trim --start-date="2019-06-12 12:01:54" --end-date="2019-06-22 12:01:56"

5) Trimmed the sync error log on the master zone
# radosgw-admin sync error trim --start-date="2019-06-07 07:19:26" --end-date="2019-06-22 15:59:00"

6) Deleted and recreated empty pools on the secondary cluster:
dc2_zone.rgw.control
dc2_zone.rgw.meta
dc2_zone.rgw.log
dc2_zone.rgw.buckets.index
dc2_zone.rgw.buckets.data
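
Before re-creating the secondary zone, I am planning to verify the result with something like this (just a sketch; the zonegroup name is the one from step 2):
# radosgw-admin period get
# radosgw-admin zonegroup get --rgw-zonegroup=master_zonegroup
# radosgw-admin datalog list
# radosgw-admin sync error list
The first two should no longer show dc2_zone, and the last two should come back (nearly) empty on the master.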

Should I clear any other data or metadata in the master zone?
Could data remain somewhere in the master zone that might affect the new replication setup?

I'm trying to track down a problem with blocked shard synchronization.
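So far I have mainly been looking at the output of these commands on the secondary zone to see which shards are behind and whether errors accumulate (the bucket name is only an example):
# radosgw-admin sync status
# radosgw-admin sync error list
# radosgw-admin bucket sync status --bucket=<bucket-name>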


Thank you in advance for your help.

Best regards,

Piotr Osiński
