________________________________
From: "Jason Dillaman" <jdillama(a)redhat.com>
To: "adamb" <adamb(a)medent.com>
Cc: "ceph-users" <ceph-users(a)ceph.io>io>, "Matt Wilder"
<matt.wilder(a)bitmex.com>
Sent: Friday, January 22, 2021 3:02:23 PM
Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
Any chance you can attempt to repeat the process on the latest master
or pacific branch clients (no need to upgrade the MONs/OSDs)?
On Fri, Jan 22, 2021 at 2:32 PM Adam Boyhan <adamb(a)medent.com> wrote:
The steps are pretty straightforward (sketched as a script after the list).
- Create rbd image of 500G on the primary
- Enable rbd-mirror snapshot on the image
- Map the image on the primary
- Format the block device with ext4
- Mount it and write out 200-300G worth of data (I am using rsync with some local real data we have)
- Unmap the image from the primary
- Create rbd snapshot
- Create rbd mirror snapshot
- Wait for copy process to complete
- Clone the rbd snapshot on secondary
- Map the image on secondary
- Try to mount on secondary
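A rough script form of those steps (a sketch, not my exact commands; the data source path and mount point are placeholders):

# on the primary cluster
rbd create --size 500G CephTestPool1/vm-100-disk-1
rbd mirror image enable CephTestPool1/vm-100-disk-1 snapshot
rbd-nbd map CephTestPool1/vm-100-disk-1            # prints e.g. /dev/nbd0
mkfs.ext4 /dev/nbd0
mount /dev/nbd0 /usr2
rsync -a /some/local/data/ /usr2/                  # 200-300G of real data
umount /usr2
rbd-nbd unmap /dev/nbd0
rbd snap create CephTestPool1/vm-100-disk-1@TestSnapper1
rbd mirror image snapshot CephTestPool1/vm-100-disk-1

# on the secondary cluster, once the mirror status shows the snapshot copied
# (rbd_default_clone_format = 2 is set in ceph.conf on both clusters)
rbd clone CephTestPool1/vm-100-disk-1@TestSnapper1 CephTestPool1/vm-100-disk-1-CLONE
rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE      # prints e.g. /dev/nbd0
mount /dev/nbd0 /usr2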
Just as a reference, all of my nodes are the same.
root@Bunkcephtest1:~# ceph --version
ceph version 15.2.8 (8b89984e92223ec320fb4c70589c39f384c86985) octopus (stable)
root@Bunkcephtest1:~# dpkg -l | grep rbd-mirror
ii rbd-mirror 15.2.8-pve2 amd64 Ceph daemon for mirroring RBD images
This is pretty straightforward; I don't know what I could be missing here.
________________________________
From: "Jason Dillaman" <jdillama(a)redhat.com>
To: "adamb" <adamb(a)medent.com>
Cc: "ceph-users" <ceph-users(a)ceph.io>io>, "Matt Wilder"
<matt.wilder(a)bitmex.com>
Sent: Friday, January 22, 2021 2:11:36 PM
Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
Any chance you could write a small reproducer test script? I can't
repeat what you are seeing and we do have test cases that really
hammer random IO on primary images, create snapshots, rinse-and-repeat
and they haven't turned up anything yet.
Thanks!
On Fri, Jan 22, 2021 at 1:50 PM Adam Boyhan <adamb(a)medent.com> wrote:
>
> I have been doing a lot of testing.
>
> The size of the RBD image doesn't have any effect.
>
> I run into the issue once I actually write data to the rbd. The more data I write out, the larger the chance of reproducing the issue.
>
> I seem to hit the issue of missing the filesystem altogether the most, but I have also had a few instances where some of the data was simply missing.
>
> I monitor the mirror status on the remote cluster until the snapshot is 100% copied and also make sure all the IO is done. My setup has no issue maxing out my 10G interconnect during replication, so it's pretty obvious once it's done.
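>
> A minimal sketch of that wait step (using the pool/image names from this thread; the grep pattern is only illustrative):
>
> # poll until the non-primary mirror snapshot reports "copied"
> until rbd snap ls --all CephTestPool1/vm-100-disk-1 | grep mirror.non_primary | grep -q copied; do
>     sleep 10
> done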
>
> The only way I have found to resolve the issue is to call a mirror resync on the secondary array.
>
> I can then map the rbd on the primary, write more data to it, snap it again, and I am back in the same position.
>
> ________________________________
> From: "adamb" <adamb(a)medent.com>
> To: "dillaman" <dillaman(a)redhat.com>
> Cc: "ceph-users" <ceph-users(a)ceph.io>io>, "Matt Wilder"
<matt.wilder(a)bitmex.com>
> Sent: Thursday, January 21, 2021 3:11:31 PM
> Subject: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
>
> Sure thing.
>
> root@Bunkcephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-1
> SNAPID NAME SIZE PROTECTED TIMESTAMP NAMESPACE
> 12192 TestSnapper1 2 TiB Thu Jan 21 14:15:02 2021 user
> 12595 .mirror.non_primary.a04e92df-3d64-4dc4-8ac8-eaba17b45403.34c4a53e-9525-446c-8de6-409ea93c5edd 2 TiB Thu Jan 21 15:05:02 2021 mirror (non-primary peer_uuids:[] 6c26557e-d011-47b1-8c99-34cf6e0c7f2f:12801 copied)
>
>
> root@Bunkcephtest1:~# rbd mirror image status CephTestPool1/vm-100-disk-1
> vm-100-disk-1:
> global_id: a04e92df-3d64-4dc4-8ac8-eaba17b45403
> state: up+replaying
> description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1611259501,"remote_snapshot_timestamp":1611259501,"replay_state":"idle"}
> service: admin on Bunkcephmon1
> last_update: 2021-01-21 15:06:24
> peer_sites:
> name: ccs
> state: up+stopped
> description: local image is primary
> last_update: 2021-01-21 15:06:23
>
>
> root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper1 CephTestPool1/vm-100-disk-1-CLONE
> root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE
> /dev/nbd0
> root@Bunkcephtest1:~# blkid /dev/nbd0
> root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
> mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing codepage or helper program, or other error.
>
>
> Primary still looks good.
>
> root@Ccscephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper1 CephTestPool1/vm-100-disk-1-CLONE
> root@Ccscephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE
> /dev/nbd0
> root@Ccscephtest1:~# blkid /dev/nbd0
> /dev/nbd0: UUID="830b8e05-d5c1-481d-896d-14e21d17017d" TYPE="ext4"
> root@Ccscephtest1:~# mount /dev/nbd0 /usr2
> root@Ccscephtest1:~# cat /proc/mounts | grep nbd0
> /dev/nbd0 /usr2 ext4 rw,relatime 0 0
>
>
>
>
>
>
> From: "Jason Dillaman" <jdillama(a)redhat.com>
> To: "adamb" <adamb(a)medent.com>
> Cc: "Eugen Block" <eblock(a)nde.ag>ag>, "ceph-users"
<ceph-users(a)ceph.io>io>, "Matt Wilder" <matt.wilder(a)bitmex.com>
> Sent: Thursday, January 21, 2021 3:01:46 PM
> Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
>
> On Thu, Jan 21, 2021 at 11:51 AM Adam Boyhan <adamb(a)medent.com> wrote:
> >
> > I was able to trigger the issue again.
> >
> > - On the primary I created a snap called TestSnapper for disk vm-100-disk-1
> > - Allowed the next RBD-Mirror scheduled snap to complete
> > - At this point the snapshot is showing up on the remote side.
> >
> > root@Bunkcephtest1:~# rbd mirror image status CephTestPool1/vm-100-disk-1
> > vm-100-disk-1:
> > global_id: a04e92df-3d64-4dc4-8ac8-eaba17b45403
> > state: up+replaying
> > description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1611247200,"remote_snapshot_timestamp":1611247200,"replay_state":"idle"}
> > service: admin on Bunkcephmon1
> > last_update: 2021-01-21 11:46:24
> > peer_sites:
> > name: ccs
> > state: up+stopped
> > description: local image is primary
> > last_update: 2021-01-21 11:46:28
> >
> > root@Ccscephtest1:/etc/pve/priv# rbd snap ls --all CephTestPool1/vm-100-disk-1
> > SNAPID NAME SIZE PROTECTED TIMESTAMP NAMESPACE
> > 11532 TestSnapper 2 TiB Thu Jan 21 11:21:25 2021 user
> > 11573 .mirror.primary.a04e92df-3d64-4dc4-8ac8-eaba17b45403.9525e4eb-41c0-499c-8879-0c7d9576e253 2 TiB Thu Jan 21 11:35:00 2021 mirror (primary peer_uuids:[debf975b-ebb8-432c-a94a-d3b101e0f770])
> >
> > Seems like the sync is complete, so I then clone it, map it, and attempt to mount it.
>
> Can you run "snap ls --all" on the non-primary cluster? The
> non-primary snapshot will list its status. On my cluster (with a much
> smaller image):
>
> #
> # CLUSTER 1
> #
> $ rbd --cluster cluster1 create --size 1G mirror/image1
> $ rbd --cluster cluster1 mirror image enable mirror/image1 snapshot
> Mirroring enabled
> $ rbd --cluster cluster1 device map -t nbd mirror/image1
> /dev/nbd0
> $ mkfs.ext4 /dev/nbd0
> mke2fs 1.45.5 (07-Jan-2020)
> Discarding device blocks: done
> Creating filesystem with 262144 4k blocks and 65536 inodes
> Filesystem UUID: 50e0da12-1f99-4d45-b6e6-5f7a7decaeff
> Superblock backups stored on blocks:
> 32768, 98304, 163840, 229376
>
> Allocating group tables: done
> Writing inode tables: done
> Creating journal (8192 blocks): done
> Writing superblocks and filesystem accounting information: done
> $ blkid /dev/nbd0
> /dev/nbd0: UUID="50e0da12-1f99-4d45-b6e6-5f7a7decaeff"
> BLOCK_SIZE="4096" TYPE="ext4"
> $ rbd --cluster cluster1 snap create mirror/image1@fs
> Creating snap: 100% complete...done.
> $ rbd --cluster cluster1 mirror image snapshot mirror/image1
> Snapshot ID: 6
> $ rbd --cluster cluster1 snap ls --all mirror/image1
> SNAPID NAME SIZE PROTECTED TIMESTAMP NAMESPACE
> 5 fs 1 GiB Thu Jan 21 14:50:24 2021 user
> 6 .mirror.primary.f9f692b8-2405-416c-9247-5628e303947a.39722e17-f7e6-4050-acf0-3842a5620d81 1 GiB Thu Jan 21 14:50:51 2021 mirror (primary peer_uuids:[cd643f30-4982-4caf-874d-cf21f6f4b66f])
>
> #
> # CLUSTER 2
> #
>
> $ rbd --cluster cluster2 mirror image status mirror/image1
> image1:
> global_id: f9f692b8-2405-416c-9247-5628e303947a
> state: up+replaying
> description: replaying, {"bytes_per_second":1140872.53,"bytes_per_snapshot":17113088.0,"local_snapshot_timestamp":1611258651,"remote_snapshot_timestamp":1611258651,"replay_state":"idle"}
> service: mirror.0 on cube-1
> last_update: 2021-01-21 14:51:18
> peer_sites:
> name: cluster1
> state: up+stopped
> description: local image is primary
> last_update: 2021-01-21 14:51:27
> $ rbd --cluster cluster2 snap ls --all mirror/image1
> SNAPID NAME SIZE PROTECTED TIMESTAMP NAMESPACE
> 5 fs 1 GiB Thu Jan 21 14:50:52 2021 user
> 6 .mirror.non_primary.f9f692b8-2405-416c-9247-5628e303947a.0a13b822-0508-47d6-a460-a8cc4e012686 1 GiB Thu Jan 21 14:50:53 2021 mirror (non-primary peer_uuids:[] 9824df2b-86c4-4264-a47e-cf968efd09e1:6 copied)
> $ rbd --cluster cluster2 --rbd-default-clone-format 2 clone
> mirror/image1@fs mirror/image2
> $ rbd --cluster cluster2 device map -t nbd mirror/image2
> /dev/nbd1
> $ blkid /dev/nbd1
> /dev/nbd1: UUID="50e0da12-1f99-4d45-b6e6-5f7a7decaeff"
> BLOCK_SIZE="4096" TYPE="ext4"
> $ mount /dev/nbd1 /mnt/
> $ mount | grep nbd
> /dev/nbd1 on /mnt type ext4 (rw,relatime,seclabel)
>
> > root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper CephTestPool1/vm-100-disk-1-CLONE
> > root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > /dev/nbd0
> > root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
> > mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing codepage or helper program, or other error.
> >
> > On the primary still no issues
> >
> > root@Ccscephtest1:/etc/pve/priv# rbd clone CephTestPool1/vm-100-disk-1@TestSnapper CephTestPool1/vm-100-disk-1-CLONE
> > root@Ccscephtest1:/etc/pve/priv# rbd-nbd map CephTestPool1/vm-100-disk-1-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > /dev/nbd0
> > root@Ccscephtest1:/etc/pve/priv# mount /dev/nbd0 /usr2
> >
> >
> >
> >
> >
> > ________________________________
> > From: "Jason Dillaman" <jdillama(a)redhat.com>
> > To: "adamb" <adamb(a)medent.com>
> > Cc: "Eugen Block" <eblock(a)nde.ag>ag>, "ceph-users"
<ceph-users(a)ceph.io>io>, "Matt Wilder" <matt.wilder(a)bitmex.com>
> > Sent: Thursday, January 21, 2021 9:42:26 AM
> > Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
> >
> > On Thu, Jan 21, 2021 at 9:40 AM Adam Boyhan <adamb(a)medent.com> wrote:
> > >
> > > After the resync finished. I can mount it now.
> > >
> > > root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool1/vm-100-disk-0-CLONE
> > > root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > > /dev/nbd0
> > > root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
> > >
> > > Makes me a bit nervous how it got into that position and everything appeared ok.
> >
> > We unfortunately need to create the snapshots that are being synced as
> > a first step, but perhaps there are some extra guardrails we can put
> > on the system to prevent premature usage if the sync status doesn't
> > indicate that it's complete.
> >
> > > ________________________________
> > > From: "Jason Dillaman" <jdillama(a)redhat.com>
> > > To: "adamb" <adamb(a)medent.com>
> > > Cc: "Eugen Block" <eblock(a)nde.ag>ag>, "ceph-users"
<ceph-users(a)ceph.io>io>, "Matt Wilder" <matt.wilder(a)bitmex.com>
> > > Sent: Thursday, January 21, 2021 9:25:11 AM
> > > Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
> > >
> > > On Thu, Jan 21, 2021 at 8:34 AM Adam Boyhan <adamb(a)medent.com> wrote:
> > > >
> > > > When cloning the snapshot on the remote cluster I can't see my ext4 filesystem.
> > > >
> > > > Using the same exact snapshot on both sides. Shouldn't this be consistent?
> > >
> > > Yes. Has the replication process completed ("rbd mirror image status
> > > CephTestPool1/vm-100-disk-0")?
> > >
> > > > Primary Site
> > > > root@Ccscephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-0 | grep TestSnapper1
> > > > 10621 TestSnapper1 2 TiB Thu Jan 21 08:15:22 2021 user
> > > >
> > > > root@Ccscephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool1/vm-100-disk-0-CLONE
> > > > root@Ccscephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > > > /dev/nbd0
> > > > root@Ccscephtest1:~# mount /dev/nbd0 /usr2
> > > >
> > > > Secondary Site
> > > > root@Bunkcephtest1:~# rbd snap ls --all CephTestPool1/vm-100-disk-0 | grep TestSnapper1
> > > > 10430 TestSnapper1 2 TiB Thu Jan 21 08:20:08 2021 user
> > > >
> > > > root@Bunkcephtest1:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool1/vm-100-disk-0-CLONE
> > > > root@Bunkcephtest1:~# rbd-nbd map CephTestPool1/vm-100-disk-0-CLONE --id admin --keyring /etc/ceph/ceph.client.admin.keyring
> > > > /dev/nbd0
> > > > root@Bunkcephtest1:~# mount /dev/nbd0 /usr2
> > > > mount: /usr2: wrong fs type, bad option, bad superblock on /dev/nbd0, missing codepage or helper program, or other error.
> > > >
> > > >
> > > >
> > > > ________________________________
> > > > From: "adamb" <adamb(a)medent.com>
> > > > To: "dillaman" <dillaman(a)redhat.com>
> > > > Cc: "Eugen Block" <eblock(a)nde.ag>ag>,
"ceph-users" <ceph-users(a)ceph.io>io>, "Matt Wilder"
<matt.wilder(a)bitmex.com>
> > > > Sent: Wednesday, January 20, 2021 3:42:46 PM
> > > > Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
> > > >
> > > > Awesome information. I knew I had to be missing something.
> > > >
> > > > All of my clients will be far newer than mimic so I don't think that will be an issue.
> > > >
> > > > Added the following to my ceph.conf on both clusters.
> > > >
> > > > rbd_default_clone_format = 2
> > > >
> > > > root@Bunkcephmon2:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool2/vm-100-disk-0-CLONE
> > > > root@Bunkcephmon2:~# rbd ls CephTestPool2
> > > > vm-100-disk-0-CLONE
> > > >
> > > > I am sure I will be back with more questions. Hoping to replace our Nimble storage with Ceph and NVMe.
> > > >
> > > > Appreciate it!
> > > >
> > > > ________________________________
> > > > From: "Jason Dillaman" <jdillama(a)redhat.com>
> > > > To: "adamb" <adamb(a)medent.com>
> > > > Cc: "Eugen Block" <eblock(a)nde.ag>ag>,
"ceph-users" <ceph-users(a)ceph.io>io>, "Matt Wilder"
<matt.wilder(a)bitmex.com>
> > > > Sent: Wednesday, January 20, 2021 3:28:39 PM
> > > > Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
> > > >
> > > > On Wed, Jan 20, 2021 at 3:10 PM Adam Boyhan <adamb(a)medent.com> wrote:
> > > > >
> > > > > That's what I thought as well, especially based on this.
> > > > >
> > > > >
> > > > >
> > > > > Note
> > > > >
> > > > > You may clone a snapshot from one pool to an image in another pool. For example, you may maintain read-only images and snapshots as templates in one pool, and writeable clones in another pool.
> > > > >
> > > > > root@Bunkcephmon2:~# rbd clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool2/vm-100-disk-0-CLONE
> > > > > 2021-01-20T15:06:35.854-0500 7fb889ffb700 -1 librbd::image::CloneRequest: 0x55c7cf8417f0 validate_parent: parent snapshot must be protected
> > > > >
> > > > > root@Bunkcephmon2:~# rbd snap protect CephTestPool1/vm-100-disk-0@TestSnapper1
> > > > > rbd: protecting snap failed: (30) Read-only file system
> > > >
> > > > You have two options: (1) protect the snapshot on the primary image so
> > > > that the protection status replicates, or (2) utilize RBD clone v2,
> > > > which doesn't require protection but does require Mimic or later
> > > > clients [1].
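> > > >
> > > > A minimal sketch of the two options, reusing the pool/image names from this thread (a sketch, not the exact commands to run):
> > > >
> > > > # option 1: protect on the primary so the protection status replicates
> > > > rbd snap protect CephTestPool1/vm-100-disk-0@TestSnapper1
> > > >
> > > > # option 2: clone v2 on the secondary; no protection needed (Mimic or later clients)
> > > > rbd --rbd-default-clone-format 2 clone CephTestPool1/vm-100-disk-0@TestSnapper1 CephTestPool2/vm-100-disk-0-CLONE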
> > > >
> > > > >
> > > > > From: "Eugen Block" <eblock(a)nde.ag>
> > > > > To: "adamb" <adamb(a)medent.com>
> > > > > Cc: "ceph-users" <ceph-users(a)ceph.io>io>,
"Matt Wilder" <matt.wilder(a)bitmex.com>
> > > > > Sent: Wednesday, January 20, 2021 3:00:54 PM
> > > > > Subject: Re: [ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses
> > > > >
> > > > > But you should be able to clone the mirrored snapshot on the remote cluster even though it’s not protected, IIRC.
> > > > >
> > > > >
> > > > > Zitat von Adam Boyhan <adamb(a)medent.com>:
> > > > >
> > > > > > Two separate 4 node clusters with 10 OSD's in each node. Micron 9300 NVMe's are the OSD drives. Heavily based on the Micron/Supermicro white papers.
> > > > > >
> > > > > > When I attempt to protect the snapshot on a remote image, it errors with read only.
> > > > > >
> > > > > > root@Bunkcephmon2:~# rbd snap protect CephTestPool1/vm-100-disk-0@TestSnapper1
> > > > > > rbd: protecting snap failed: (30) Read-only file system
> > > > > > _______________________________________________
> > > > > > ceph-users mailing list -- ceph-users(a)ceph.io
> > > > > > To unsubscribe send an email to ceph-users-leave(a)ceph.io
> > > > > _______________________________________________
> > > > > ceph-users mailing list -- ceph-users(a)ceph.io
> > > > > To unsubscribe send an email to ceph-users-leave(a)ceph.io
> > > >
> > > > [1] https://ceph.io/community/new-mimic-simplified-rbd-image-cloning/
> > > >
> > > > --
> > > > Jason
> > > >
> > >
> > >
> > > --
> > > Jason
> >
> >
> >
> > --
> > Jason
>
>
>
> --
> Jason
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
--
Jason