We have a number of bug fixes for snapshot-based mirroring
pending for the next Octopus release. I think this stuck-snapshot case
has been fixed, but I'll try to verify on the pacific branch to
be sure.
On Thu, Jan 21, 2021 at 9:11 AM Adam Boyhan <adamb(a)medent.com> wrote:
I decided to request a resync to see the results. I have a very aggressive snapshot mirror
schedule of 5 minutes, and replication keeps starting on the latest snapshot before the
current sync finishes. I'm pretty sure this would loop over and over if I didn't remove the
schedule.
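A toy model of the loop described above, as a minimal Python sketch. It assumes a newer mirror snapshot always preempts a sync that is still in flight, which is how the behavior is described in this message, not a statement of rbd-mirror's actual scheduling logic. The function name `syncs_completed` and the 12-minute sync duration are hypothetical; only the 5-minute interval comes from the message.

```python
def syncs_completed(interval_min, sync_min, horizon_min):
    """Return the creation times (in minutes) of snapshots whose sync completes.

    Toy model: a snapshot is taken every `interval_min` minutes, each sync
    needs `sync_min` minutes, and a sync is abandoned as soon as the next
    snapshot is taken (assumption, see the lead-in).
    """
    completed = []
    for taken_at in range(0, horizon_min, interval_min):
        next_snapshot = taken_at + interval_min
        finishes_at = taken_at + sync_min
        # The sync only counts if it ends before it is preempted
        # and before the end of the observed window.
        if finishes_at <= next_snapshot and finishes_at <= horizon_min:
            completed.append(taken_at)
    return completed


# 5-minute schedule, hypothetical 12-minute sync, one hour observed.
never_finishes = syncs_completed(interval_min=5, sync_min=12, horizon_min=60)

# Same schedule with a hypothetical 3-minute sync.
keeps_up = syncs_completed(interval_min=5, sync_min=3, horizon_min=20)

print(never_finishes, keeps_up)
```

Under that assumption, no sync ever completes while a full sync takes longer than the schedule interval, which matches the "loop over and over" concern.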
root@Ccscephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-0
SNAPID  NAME                                                                                       SIZE   PROTECTED  TIMESTAMP                 NAMESPACE
10082   .mirror.primary.90c53c21-6951-4218-9f07-9e983d490993.e0c63479-b09e-4c66-a65b-085b67a19907  2 TiB             Thu Jan 21 07:10:09 2021  mirror (primary peer_uuids:[])
10621   TestSnapper1                                                                               2 TiB             Thu Jan 21 08:15:22 2021  user
10883   .mirror.primary.90c53c21-6951-4218-9f07-9e983d490993.7242f4d1-5203-4273-8b6d-ff4e1411216d  2 TiB             Thu Jan 21 08:50:08 2021  mirror (primary peer_uuids:[])
10923   .mirror.primary.90c53c21-6951-4218-9f07-9e983d490993.d0c3c2e7-880b-4e62-90cc-fd501e9a87c9  2 TiB             Thu Jan 21 08:55:11 2021  mirror (primary peer_uuids:[debf975b-ebb8-432c-a94a-d3b101e0f770])
10963   .mirror.primary.90c53c21-6951-4218-9f07-9e983d490993.655f7c17-2f85-42e5-9ffe-777a8a48dda3  2 TiB             Thu Jan 21 09:00:09 2021  mirror (primary peer_uuids:[])
10993   .mirror.primary.90c53c21-6951-4218-9f07-9e983d490993.268b960c-51e9-4a60-99b4-c5e7c303fdd8  2 TiB             Thu Jan 21 09:05:25 2021  mirror (primary peer_uuids:[debf975b-ebb8-432c-a94a-d3b101e0f770])
I have removed the 5-minute schedule for now, but I don't think this is
expected behavior?
From: "adamb" <adamb(a)medent.com>
To: "ceph-users" <ceph-users(a)ceph.io>
Sent: Thursday, January 21, 2021 7:40:01 AM
Subject: [ceph-users] RBD-Mirror Mirror Snapshot stuck
I have an rbd-mirror snapshot on one image that failed to replicate, and now it's not getting
cleaned up.
The cause was my own fault, based on the steps I took. I'm just trying to understand how to clean
up/handle the situation.
Here is how I got into this situation.
- Created manual rbd snapshot on the image
- On the remote cluster I cloned the snapshot
- While cloned on the secondary cluster I made the mistake of deleting the snapshot on
the primary
- The subsequent mirror snapshot failed
- I then removed the clone
- The next mirror snapshot was successful but I was left with this mirror snapshot on the
primary that I can't seem to get rid of
root@Ccscephtest1:/var/log/ceph# rbd snap ls --all CephTestPool1/vm-100-disk-0
SNAPID  NAME                                                                                       SIZE   PROTECTED  TIMESTAMP                 NAMESPACE
10082   .mirror.primary.90c53c21-6951-4218-9f07-9e983d490993.e0c63479-b09e-4c66-a65b-085b67a19907  2 TiB             Thu Jan 21 07:10:09 2021  mirror (primary peer_uuids:[])
10243   .mirror.primary.90c53c21-6951-4218-9f07-9e983d490993.483e55aa-2f64-4bb0-ac0f-7b5aac59830e  2 TiB             Thu Jan 21 07:30:08 2021  mirror (primary peer_uuids:[debf975b-ebb8-432c-a94a-d3b101e0f770])
I have tried deleting the snap with "rbd snap rm", as I would for normal user-created
snaps, but no luck. Is there any way to force the deletion?
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
--
Jason