[ceph-users] Re: RBD Image can't be formatted - blk_error

8 Jan 2021

On Fri, Jan 8, 2021 at 2:19 PM Gaël THEROND <gael.therond(a)bitswalk.com> wrote:

 Hi everyone!

 I'm facing a weird issue with one of my CEPH clusters:

 OS: CentOS - 8.2.2004 (Core)
 CEPH: Nautilus 14.2.11 - stable
 RBD using an erasure code profile (k=3, m=2)

 When I try to format one of my RBD images (client side), I get the
 following kernel messages multiple times, with different sector IDs:

 [2417011.790154] blk_update_request: I/O error, dev rbd23, sector 164743869184 op 0x3:(DISCARD) flags 0x4000 phys_seg 1 prio class 0
 [2417011.791404] rbd: rbd23: discard at objno 20110336 2490368~1703936 result -1

 At first I suspected a faulty disk, but the monitoring system is not
 showing anything faulty, so I ran manual tests on all my OSDs to check
 disk health using smartctl etc.

 None of them is reported as unhealthy: there are no counters showing
 faulty sectors or read/write errors, and the wear level is 99%.

 So, the only particularity of this image is that it is an 80 TB image, but
 that shouldn't be an issue as we already use images of that size on
 another pool.

 If anyone has a clue as to how I could sort this out, I'll be more than happy

Hi Gaël,

What command are you running to format the image?

Is it persistent?  After the first formatting attempt fails, do the
following attempts fail too?

Is it always the same set of sectors?

Could you please attach the output of "rbd info" for that image and the
entire kernel log from the time that image is mapped?
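
To check whether the same sectors keep failing, something along these lines
could be run over the saved kernel log. It is only a rough sketch: the
function names are made up for illustration, and it assumes 512-byte sectors
and the default 4 MiB object size with no custom striping ("rbd info" would
show the real object size).

```python
import re

SECTOR_SIZE = 512              # the block layer reports 512-byte sectors
OBJECT_SIZE = 4 * 1024 ** 2    # assumption: default 4 MiB objects, no custom striping

# Matches the "blk_update_request: I/O error" lines quoted above; \s+ also
# tolerates a line wrap between "sector" and the number.
BLK_ERROR = re.compile(r"I/O error, dev (?P<dev>rbd\d+), sector\s+(?P<sector>\d+)")


def failing_sectors(log_text, dev):
    """Return the sorted, de-duplicated sectors reported as failed for one device."""
    return sorted({
        int(m.group("sector"))
        for m in BLK_ERROR.finditer(log_text)
        if m.group("dev") == dev
    })


def sector_to_object(sector, object_size=OBJECT_SIZE):
    """Map a sector to (object number, byte offset inside that object)."""
    return divmod(sector * SECTOR_SIZE, object_size)


if __name__ == "__main__":
    sample = (
        "[2417011.790154] blk_update_request: I/O error, dev rbd23, sector "
        "164743869184 op 0x3:(DISCARD) flags 0x4000 phys_seg 1 prio class 0\n"
    )
    for sector in failing_sectors(sample, "rbd23"):
        print(sector, sector_to_object(sector))
```

Under those assumptions, sector 164743869184 maps to byte offset 2490368
inside its object, which agrees with the offset in the rbd message, while
the object number comes out as 20110335, one lower than the logged objno
20110336. Comparing the lists from two formatting attempts would show
whether the failures move around.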

Thanks,

                Ilya
