rbd images inaccessible for a longer period of time - ceph-users

19 Dec 2019

Hi,

yesterday I had to power off some vm (proxmox) backed by rbd images for maintenance.
After the VMs were off, I tried to create a Snapshot which didn't finish even after
half an hour.
Because it was maintenance I rebooted all VM nodes an all ceph nodes - nothing changed.
Powering on the VM was impossible, kvm exited with timeout.
This happened to two of about 15 VM.

Two of three Images of one VM still had locks, which I did remove but still unable to
power on.
I tried to access the Image by mapping it with rbd-nbd, which was unsuccessful and logged
this:

[ 8601.746971] block nbd0: Connection timed out
[ 8601.747648] block nbd0: shutting down sockets
[ 8601.747653] block nbd0: Connection timed out
[...]
[ 8601.750419] block nbd0: Connection timed out
[ 8601.750831] print_req_error: 121 callbacks suppressed
[ 8601.750832] blk_update_request: I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0
phys_seg 1 prio class 0
[ 8601.751261] buffer_io_error: 182 callbacks suppressed
[ 8601.751262] Buffer I/O error on dev nbd0, logical block 0, async page read
[ 8601.751678] blk_update_request: I/O error, dev nbd0, sector 1 op 0x0:(READ) flags 0x0
phys_seg 1 prio class 0
[...]
[ 8601.760283] ldm_validate_partition_table(): Disk read failed.
[ 8601.760344] Dev nbd0: unable to read RDB block 0
[ 8601.760985]  nbd0: unable to read partition table
[ 8601.761282] nbd0: detected capacity change from 0 to 375809638400
[ 8601.761382] ldm_validate_partition_table(): Disk read failed.
[ 8601.761461] Dev nbd0: unable to read RDB block 0
[ 8601.762145]  nbd0: unable to read partition table

The rbd-nbd process kept existing and had to be killed
Same thing with qemu-nbd.

Exporting the Image via rbd export worked fine, also a rbd copy.
Any other operation on the Image (feature dis / enable) took forever so I had to abort
it.
It seems that every operation leaves a lock on the image.

Because it was in the middle of the night, I stopped working on it.
Today Morning one of the images was accessible again, the others not.

Anybody any hint?
Some system information below.

Regards,
Yves

ceph version 14.2.5 (3ce7517553bdd5195b68a6ffaf0bd7f3acad1647) nautilus (stable)
Primary Cluster with a backup cluster (rbd mirror)

[global]
        auth client required = none
        auth cluster required = none
        auth service required = none
        auth supported = none
        cephx_sign_messages = false
        cephx require signatures = False
        cluster_network = 172.16.230.0/24
        debug asok = 0/0
        debug auth = 0/0
        debug bdev = 0/0
        debug bluefs = 0/0
        debug bluestore = 0/0
        debug buffer = 0/0
        debug civetweb = 0/0
        debug client = 0/0
        debug compressor = 0/0
        debug context = 0/0
        debug crush = 0/0
        debug crypto = 0/0
        debug dpdk = 0/0
        debug eventtrace = 0/0
        debug filer = 0/0
        debug filestore = 0/0
        debug finisher = 0/0
        debug fuse = 0/0
        debug heartbeatmap = 0/0
        debug javaclient = 0/0
        debug journal = 0/0
        debug journaler = 0/0
        debug kinetic = 0/0
        debug kstore = 0/0
        debug leveldb = 0/0
        debug lockdep = 0/0
        debug mds = 0/0
        debug mds balancer = 0/0
        debug mds locker = 0/0
        debug mds log = 0/0
        debug mds log expire = 0/0
        debug mds migrator = 0/0
        debug memdb = 0/0
        debug mgr = 0/0
        debug mgrc = 0/0
        debug mon = 0/0
        debug monc = 0/00
        debug ms = 0/0
        debug none = 0/0
        debug objclass = 0/0
        debug objectcacher = 0/0
        debug objecter = 0/0
        debug optracker = 0/0
        debug osd = 0/0
        debug paxos = 0/0
        debug perfcounter = 0/0
        debug rados = 0/0
        debug rbd = 0/0
        debug rbd mirror = 0/0
        debug rbd replay = 0/0
        debug refs = 0/0
        debug reserver = 0/0
        debug rgw = 0/0
        debug rocksdb = 0/0
        debug striper = 0/0
        debug throttle = 0/0
        debug timer = 0/0
        debug tp = 0/0
        debug xio = 0/0
        fsid = 27fdf1bb-22a1-4d5e-9729-780cbdcd33fe
        mon_allow_pool_delete = true
        mon_host = 172.16.230.142 172.16.230.144 172.16.230.146
        mon_osd_down_out_subtree_limit = host
        osd_backfill_scan_max = 16
        osd_backfill_scan_min = 4
        osd_deep_scrub_interval = 1209600
        osd_journal_size = 5120
        osd_max_backfills = 1
        osd_max_trimming_pgs = 1
        osd_pg_max_concurrent_snap_trims = 1
        osd_pool_default_min_size = 2
        osd_pool_default_size = 3
        osd_recovery_max_active = 1
        osd_recovery_max_single_start = 1
        osd_recovery_op_priority = 1
        osd_recovery_threads = 1
        osd_scrub_begin_hour = 19
        osd_scrub_chunk_max = 1
        osd_scrub_chunk_min = 1
        osd_scrub_during_recovery = false
        osd_scrub_end_hour = 6
        osd_scrub_priority = 1
        osd_scrub_sleep = 0.5
        osd_snap_trim_priority = 1
        osd_snap_trim_sleep = 0.005
        osd_srub_max_interval = 1209600
        public_network = 172.16.230.0/24
        max open files = 131072
        osd objectstore = bluestore
        osd op threads = 2
        osd crush update on start = true

Currently inaccessible image:
rbd image 'vm-29009-disk-2':
        size 200 GiB in 51200 objects
        order 22 (4 MiB objects)
        snapshot_count: 2
        id: 1abd04da8b9a4d
        block_name_prefix: rbd_data.1abd04da8b9a4d
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten,
journaling
        op_features:
        flags:
        create_timestamp: Tue Jul  9 13:07:36 2019
        access_timestamp: Thu Dec 19 01:35:34 2019
        modify_timestamp: Thu Dec 19 00:19:32 2019
        journal: 1abd04da8b9a4d
        mirroring state: enabled
        mirroring global id: c71ec81f-18be-4d0b-93ed-0cebe3e619bb
        mirroring primary: true