Any chance you could write a small reproducer test script? I can't
reproduce what you are seeing, and we do have test cases that really
hammer random IO on primary images, create snapshots, rinse and repeat,
and they haven't turned up anything yet.
Thanks!
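(A reproducer along the lines described below might look something like this untested sketch. The pool/image names, sizes, and mount point are placeholders, and the cluster-side commands are wrapped in functions so nothing runs until invoked on the right cluster.)

```shell
#!/bin/sh
# Untested reproducer sketch based on the steps described in this thread.
# Pool/image names, sizes, and the mount point are placeholders; run
# reproduce_primary on the primary cluster, then check_secondary on the
# secondary once "rbd snap ls --all" reports the mirror snapshot "copied".
POOL=${POOL:-CephTestPool1}
IMG=${IMG:-repro-disk}
FILL_MB=${FILL_MB:-8192}   # more data written => higher chance of hitting it

reproduce_primary() {
    rbd create --size 100G "$POOL/$IMG"
    rbd mirror image enable "$POOL/$IMG" snapshot
    dev=$(rbd-nbd map "$POOL/$IMG")
    mkfs.ext4 "$dev"
    mount "$dev" /mnt
    dd if=/dev/urandom of=/mnt/fill bs=1M count="$FILL_MB"
    umount /mnt
    # user snapshot first, then a mirror snapshot to trigger replication
    rbd snap create "$POOL/$IMG@TestSnapper1"
    rbd mirror image snapshot "$POOL/$IMG"
    rbd-nbd unmap "$dev"
}

check_secondary() {
    rbd --rbd-default-clone-format 2 clone \
        "$POOL/$IMG@TestSnapper1" "$POOL/$IMG-CLONE"
    dev=$(rbd-nbd map "$POOL/$IMG-CLONE")
    blkid "$dev"   # empty output here is the reported failure mode
    rbd-nbd unmap "$dev"
}
```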
On Fri, Jan 22, 2021 at 1:50 PM Adam Boyhan <adamb(a)medent.com> wrote:
I have been doing a lot of testing.
The size of the RBD image doesn't have any effect.
I run into the issue once I actually write data to the rbd. The more data I write out,
the larger the chance of reproducing the issue.
I seem to hit the issue of the filesystem missing altogether most often, but I have also
had a few instances where some of the data was simply missing.
I monitor the mirror status on the remote cluster until the snapshot is 100% copied and
also make sure all the IO is done. My setup has no issue maxing out my 10G interconnect
during replication, so it's pretty obvious once it's done.
The only way I have found to resolve the issue is to call a mirror resync on the
secondary array.
I can then map the rbd on the primary, write more data to it, snap it again, and I am
back in the same position.
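(The "wait until the status says it is done" check described above could be sketched as a helper over the one-line JSON description that `rbd mirror image status` prints; field names are taken from the status output quoted in this thread. Note that, as this thread shows, passing such a check did not guarantee the replicated data was intact.)

```shell
#!/bin/sh
# Rough sketch, not an official check: decide whether replication looks
# idle/caught-up from the JSON "description" printed by
# "rbd mirror image status". Field names match the output quoted in this
# thread; in this bug report the status looked fine while data was wrong.
sync_looks_done() {
    desc=$1
    case $desc in
        *'"replay_state":"idle"'*) ;;
        *) return 1 ;;
    esac
    local_ts=${desc#*\"local_snapshot_timestamp\":}
    local_ts=${local_ts%%,*}
    local_ts=${local_ts%\}}
    remote_ts=${desc#*\"remote_snapshot_timestamp\":}
    remote_ts=${remote_ts%%,*}
    remote_ts=${remote_ts%\}}
    [ -n "$local_ts" ] && [ "$local_ts" = "$remote_ts" ]
}

DESC='{"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1611259501,"remote_snapshot_timestamp":1611259501,"replay_state":"idle"}'
sync_looks_done "$DESC" && echo "sync looks done"
```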
________________________________
From: "adamb" <adamb(a)medent.com>
To: "dillaman" <dillaman(a)redhat.com>
Cc: "ceph-users" <ceph-users(a)ceph.io>, "Matt Wilder"
<matt.wilder(a)bitmex.com>
Sent: Thursday, January 21, 2021 3:11:31 PM
Subject: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
Sure thing.
root@Bunkcephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-1
SNAPID  NAME                                                                                           SIZE   PROTECTED  TIMESTAMP                 NAMESPACE
 12192  TestSnapper1                                                                                   2 TiB             Thu Jan 21 14:15:02 2021  user
 12595  .mirror.non_primary.a04e92df-3d64-4dc4-8ac8-eaba17b45403.34c4a53e-9525-446c-8de6-409ea93c5edd  2 TiB             Thu Jan 21 15:05:02 2021  mirror (non-primary peer_uuids:[] 6c26557e-d011-47b1-8c99-34cf6e0c7f2f:12801 copied)
root@Bunkcephtest1:~# rbd mirror image status CephTestPool1/vm-100-disk-1
vm-100-disk-1:
  global_id:   a04e92df-3d64-4dc4-8ac8-eaba17b45403
  state:       up+replaying
  description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1611259501,"remote_snapshot_timestamp":1611259501,"replay_state":"idle"}
  service:     admin on Bunkcephmon1
  last_update: 2021-01-21 15:06:24
  peer_sites:
    name: ccs
    state: up+stopped
    description: local image is primary
    last_update: 2021-01-21 15:06:23
root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper1
CephTestPool1/vm-100-disk-1-CLONE
root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE
/dev/nbd0
root@Bunkcephtest1:~# blkid /dev/nbd0
root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing codepage or
helper program, or other error.
Primary still looks good.
root@Ccscephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper1
CephTestPool1/vm-100-disk-1-CLONE
root@Ccscephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE
/dev/nbd0
root@Ccscephtest1:~# blkid /dev/nbd0
/dev/nbd0: UUID="830b8e05-d5c1-481d-896d-14e21d17017d" TYPE="ext4"
root@Ccscephtest1:~# mount /dev/nbd0 /usr2
root@Ccscephtest1:~# cat /proc/mounts | grep nbd0
/dev/nbd0 /usr2 ext4 rw,relatime 0 0
From: "Jason Dillaman" <jdillama(a)redhat.com>
To: "adamb" <adamb(a)medent.com>
Cc: "Eugen Block" <eblock(a)nde.ag>, "ceph-users"
<ceph-users(a)ceph.io>, "Matt Wilder" <matt.wilder(a)bitmex.com>
Sent: Thursday, January 21, 2021 3:01:46 PM
Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
On Thu, Jan 21, 2021 at 11:51 AM Adam Boyhan <adamb(a)medent.com> wrote:
I was able to trigger the issue again.
- On the primary I created a snap called TestSnapper for disk vm-100-disk-1
- Allowed the next RBD-Mirror scheduled snap to complete
- At this point the snapshot is showing up on the remote side.
root@Bunkcephtest1:~# rbd mirror image status CephTestPool1/vm-100-disk-1
vm-100-disk-1:
  global_id:   a04e92df-3d64-4dc4-8ac8-eaba17b45403
  state:       up+replaying
  description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1611247200,"remote_snapshot_timestamp":1611247200,"replay_state":"idle"}
  service:     admin on Bunkcephmon1
  last_update: 2021-01-21 11:46:24
  peer_sites:
    name: ccs
    state: up+stopped
    description: local image is primary
    last_update: 2021-01-21 11:46:28
root@Ccscephtest1:/etc/pve/priv# rbd snap ls --all CephTestPool1/vm-100-disk-1
SNAPID  NAME                                                                                       SIZE   PROTECTED  TIMESTAMP                 NAMESPACE
 11532  TestSnapper                                                                                2 TiB             Thu Jan 21 11:21:25 2021  user
 11573  .mirror.primary.a04e92df-3d64-4dc4-8ac8-eaba17b45403.9525e4eb-41c0-499c-8879-0c7d9576e253  2 TiB             Thu Jan 21 11:35:00 2021  mirror (primary peer_uuids:[debf975b-ebb8-432c-a94a-d3b101e0f770])
Seems like the sync is complete, so I then clone it, map it, and attempt to mount it.
Can you run "snap ls --all" on the non-primary cluster? The
non-primary snapshot will list its status. On my cluster (with a much
smaller image):
#
# CLUSTER 1
#
$ rbd --cluster cluster1 create --size 1G mirror/image1
$ rbd --cluster cluster1 mirror image enable mirror/image1 snapshot
Mirroring enabled
$ rbd --cluster cluster1 device map -t nbd mirror/image1
/dev/nbd0
$ mkfs.ext4 /dev/nbd0
mke2fs 1.45.5 (07-Jan-2020)
Discarding device blocks: done
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: 50e0da12-1f99-4d45-b6e6-5f7a7decaeff
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
$ blkid /dev/nbd0
/dev/nbd0: UUID="50e0da12-1f99-4d45-b6e6-5f7a7decaeff"
BLOCK_SIZE="4096" TYPE="ext4"
$ rbd --cluster cluster1 snap create mirror/image1@fs
Creating snap: 100% complete...done.
$ rbd --cluster cluster1 mirror image snapshot mirror/image1
Snapshot ID: 6
$ rbd --cluster cluster1 snap ls --all mirror/image1
SNAPID  NAME                                                                                       SIZE   PROTECTED  TIMESTAMP                 NAMESPACE
     5  fs                                                                                         1 GiB             Thu Jan 21 14:50:24 2021  user
     6  .mirror.primary.f9f692b8-2405-416c-9247-5628e303947a.39722e17-f7e6-4050-acf0-3842a5620d81  1 GiB             Thu Jan 21 14:50:51 2021  mirror (primary peer_uuids:[cd643f30-4982-4caf-874d-cf21f6f4b66f])
#
# CLUSTER 2
#
$ rbd --cluster cluster2 mirror image status mirror/image1
image1:
  global_id:   f9f692b8-2405-416c-9247-5628e303947a
  state:       up+replaying
  description: replaying, {"bytes_per_second":1140872.53,"bytes_per_snapshot":17113088.0,"local_snapshot_timestamp":1611258651,"remote_snapshot_timestamp":1611258651,"replay_state":"idle"}
  service:     mirror.0 on cube-1
  last_update: 2021-01-21 14:51:18
  peer_sites:
    name: cluster1
    state: up+stopped
    description: local image is primary
    last_update: 2021-01-21 14:51:27
$ rbd --cluster cluster2 snap ls --all mirror/image1
SNAPID  NAME                                                                                           SIZE   PROTECTED  TIMESTAMP                 NAMESPACE
     5  fs                                                                                             1 GiB             Thu Jan 21 14:50:52 2021  user
     6  .mirror.non_primary.f9f692b8-2405-416c-9247-5628e303947a.0a13b822-0508-47d6-a460-a8cc4e012686  1 GiB             Thu Jan 21 14:50:53 2021  mirror (non-primary peer_uuids:[] 9824df2b-86c4-4264-a47e-cf968efd09e1:6 copied)
$ rbd --cluster cluster2 --rbd-default-clone-format 2 clone
mirror/image1@fs mirror/image2
$ rbd --cluster cluster2 device map -t nbd mirror/image2
/dev/nbd1
$ blkid /dev/nbd1
/dev/nbd1: UUID="50e0da12-1f99-4d45-b6e6-5f7a7decaeff"
BLOCK_SIZE="4096" TYPE="ext4"
$ mount /dev/nbd1 /mnt/
$ mount | grep nbd
/dev/nbd1 on /mnt type ext4 (rw,relatime,seclabel)
root@Bunkcephtest1:~# rbd clone
CephTestPool1/vm-100-disk-1@TestSnapper CephTestPool1/vm-100-disk-1-CLONE
root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE --id admin --keyring
/etc/ceph/ceph.client.admin.keyring
/dev/nbd0
root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing codepage or
helper program, or other error.
On the primary, still no issues:
root@Ccscephtest1:/etc/pve/priv# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper
CephTestPool1/vm-100-disk-1-CLONE
root@Ccscephtest1:/etc/pve/priv# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE --id admin
--keyring /etc/ceph/ceph.client.admin.keyring
/dev/nbd0
root@Ccscephtest1:/etc/pve/priv# mount /dev/nbd0 /usr2
________________________________
From: "Jason Dillaman" <jdillama(a)redhat.com>
To: "adamb" <adamb(a)medent.com>
Cc: "Eugen Block" <eblock(a)nde.ag>, "ceph-users"
<ceph-users(a)ceph.io>, "Matt Wilder" <matt.wilder(a)bitmex.com>
Sent: Thursday, January 21, 2021 9:42:26 AM
Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
On Thu, Jan 21, 2021 at 9:40 AM Adam Boyhan <adamb(a)medent.com> wrote:
After the resync finished, I can mount it now.
root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1
CephTestPool1/vm-100-disk-0-CLONE
root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id admin --keyring
/etc/ceph/ceph.client.admin.keyring
/dev/nbd0
root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
Makes me a bit nervous how it got into that position while everything appeared OK.
We unfortunately need to create the snapshots that are being synced as
a first step, but perhaps there are some extra guardrails we can put
on the system to prevent premature usage if the sync status doesn't
indicate that it's complete.
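(In the meantime, a client-side wrapper could approximate such a guardrail. This is a sketch, not an rbd feature: it refuses to proceed unless the mirror snapshot's line from `rbd snap ls --all` on the non-primary cluster reports "copied", with the sample line taken from output quoted in this thread.)

```shell
#!/bin/sh
# Sketch of a client-side guardrail, not an official rbd safeguard: only
# proceed with a clone on the non-primary cluster if the mirror snapshot's
# "rbd snap ls --all" line reports its sync status as "copied". The sample
# line is taken from the output quoted in this thread.
snap_is_copied() {
    case $1 in
        *'non-primary'*'copied)'*) return 0 ;;
        *) return 1 ;;
    esac
}

LINE='1 GiB Thu Jan 21 14:50:53 2021 mirror (non-primary peer_uuids:[] 9824df2b-86c4-4264-a47e-cf968efd09e1:6 copied)'
if snap_is_copied "$LINE"; then
    echo "safe to clone"
    # e.g. rbd --rbd-default-clone-format 2 clone pool/image@snap pool/clone
fi
```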
________________________________
From: "Jason Dillaman" <jdillama(a)redhat.com>
To: "adamb" <adamb(a)medent.com>
Cc: "Eugen Block" <eblock(a)nde.ag>, "ceph-users"
<ceph-users(a)ceph.io>, "Matt Wilder" <matt.wilder(a)bitmex.com>
Sent: Thursday, January 21, 2021 9:25:11 AM
Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
On Thu, Jan 21, 2021 at 8:34 AM Adam Boyhan <adamb(a)medent.com> wrote:
When cloning the snapshot on the remote cluster, I can't see my ext4 filesystem.
I am using the exact same snapshot on both sides. Shouldn't this be consistent?
Yes. Has the replication process completed ("rbd mirror image status
CephTestPool1/vm-100-disk-0")?
Primary Site
root@Ccscephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-0 | grep TestSnapper1
10621 TestSnapper1 2 TiB Thu Jan 21 08:15:22 2021 user
root@Ccscephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1
CephTestPool1/vm-100-disk-0-CLONE
root@Ccscephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id admin --keyring
/etc/ceph/ceph.client.admin.keyring
/dev/nbd0
root@Ccscephtest1:~# mount /dev/nbd0 /usr2
Secondary Site
root@Bunkcephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-0 | grep TestSnapper1
10430 TestSnapper1 2 TiB Thu Jan 21 08:20:08 2021 user
root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1
CephTestPool1/vm-100-disk-0-CLONE
root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id admin --keyring
/etc/ceph/ceph.client.admin.keyring
/dev/nbd0
root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing codepage or
helper program, or other error.
________________________________
From: "adamb" <adamb(a)medent.com>
To: "dillaman" <dillaman(a)redhat.com>
Cc: "Eugen Block" <eblock(a)nde.ag>, "ceph-users"
<ceph-users(a)ceph.io>, "Matt Wilder" <matt.wilder(a)bitmex.com>
Sent: Wednesday, January 20, 2021 3:42:46 PM
Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
Awesome information. I knew I had to be missing something.
All of my clients will be far newer than Mimic, so I don't think that will be an
issue.
Added the following to my ceph.conf on both clusters.
rbd_default_clone_format = 2
root@Bunkcephmon2:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1
CephTestPool2/vm-100-disk-0-CLONE
root@Bunkcephmon2:~# rbd ls CephTestPool2
vm-100-disk-0-CLONE
I am sure I will be back with more questions. Hoping to replace our Nimble storage with
Ceph and NVMe.
Appreciate it!
________________________________
From: "Jason Dillaman" <jdillama(a)redhat.com>
To: "adamb" <adamb(a)medent.com>
Cc: "Eugen Block" <eblock(a)nde.ag>, "ceph-users"
<ceph-users(a)ceph.io>, "Matt Wilder" <matt.wilder(a)bitmex.com>
Sent: Wednesday, January 20, 2021 3:28:39 PM
Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
On Wed, Jan 20, 2021 at 3:10 PM Adam Boyhan <adamb(a)medent.com> wrote:
>
> That's what I thought as well, especially based on this.
>
>
>
> Note
>
> You may clone a snapshot from one pool to an image in another pool. For example, you
> may maintain read-only images and snapshots as templates in one pool, and writeable
> clones in another pool.
>
> root@Bunkcephmon2:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool2/vm-100-disk-0-CLONE
> 2021-01-20T15:06:35.854-0500 7fb889ffb700 -1 librbd::image::CloneRequest: 0x55c7cf8417f0 validate_parent: parent snapshot must be protected
>
> root@Bunkcephmon2:~# rbd snap protect CephTestPool1/vm-100-disk-0@TestSnapper1
> rbd: protecting snap failed: (30) Read-only file system
You have two options: (1) protect the snapshot on the primary image so
that the protection status replicates or (2) utilize RBD clone v2
which doesn't require protection but does require Mimic or later
clients [1].
>
> From: "Eugen Block" <eblock(a)nde.ag>
> To: "adamb" <adamb(a)medent.com>
> Cc: "ceph-users" <ceph-users(a)ceph.io>, "Matt Wilder"
> <matt.wilder(a)bitmex.com>
> Sent: Wednesday, January 20, 2021 3:00:54 PM
> Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
>
> But you should be able to clone the mirrored snapshot on the remote
> cluster even though it’s not protected, IIRC.
>
>
> Quoting Adam Boyhan <adamb(a)medent.com>:
>
> > Two separate 4 node clusters with 10 OSD's in each node. Micron 9300
> > NVMe's are the OSD drives. Heavily based on the Micron/Supermicro
> > white papers.
> >
> > When I attempt to protect the snapshot on a remote image, it errors
> > with read only.
> >
> > root@Bunkcephmon2:~# rbd snap protect
> > CephTestPool1/vm-100-disk-0@TestSnapper1
> > rbd: protecting snap failed: (30) Read-only file system
> > _______________________________________________
> > ceph-users mailing list -- ceph-users(a)ceph.io
> > To unsubscribe send an email to ceph-users-leave(a)ceph.io
[1]
https://ceph.io/community/new-mimic-simplified-rbd-image-cloning/
--
Jason