Hi David,
the morning's log is fine for now. The full log is preferred unless it's
too large; if it is, please take the 20000 lines prior to the read
failure. Just in case, please also check whether there are additional
occurrences of "_verify_csum bad" before this snippet.
Once you face the issue again, please run a deep fsck - I'd like to make
sure whether these checksum failures are persistent or not. The latter
would be a strong symptom of a transient read error rather than real
on-disk corruption.
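
For a cephadm/podman deployment, something along these lines should work
(just a sketch - osd.123 and the file names are examples, adjust for
your setup):

    # dump the full log for the affected OSD (cephadm forwards
    # container logs to journald)
    cephadm logs --name osd.123 > osd.123.log

    # find the first read failure and keep the 20000 lines before it
    n=$(grep -n -m1 '_verify_csum bad' osd.123.log | cut -d: -f1)
    sed -n "$(( n > 20000 ? n - 20000 : 1 )),${n}p" osd.123.log > osd.123.snippet.log

    # count occurrences to spot any earlier than the snippet
    grep -c '_verify_csum bad' osd.123.log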
David Orman wrote:

Yes, we can do this (log/deep fsck) on the next incident. I have not
repaired the currently inconsistent PG; would you like me to run the
fsck on it now, prior to the PG repair? PG repair is what we've been
using. Unfortunately, I did not see this until now, and we had a
failure this morning, so I am not sure how useful the logs will be
to you. Are you just looking for the entries related to the failure,
such as the ones posted in the initial message, or the full OSD log?
If you've got a preferred method for me to capture what you want, let
me know. We are running Ceph in cephadm/podman containers.
We will update with logs when the next failure happens.
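
For reference, the PG-level workflow we've been using looks roughly like
this (a sketch; 2.3fe is the example PG from the logs quoted below):

    # list the PGs currently flagged inconsistent
    ceph health detail | grep inconsistent

    # show which object/shard failed and why (after a deep-scrub)
    rados list-inconsistent-obj 2.3fe --format=json-pretty

    # ask the primary to repair the PG
    ceph pg repair 2.3fe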
On Tue, Sep 15, 2020 at 4:05 AM Igor Fedotov <ifedotov@suse.de> wrote:
Hi Welby,
could you share an OSD log containing such errors then, please?
Also - David mentioned 'repair' fixing the issue - is that a
bluestore repair or a PG one?
If the latter, could you please try a bluestore deep fsck (via
'ceph-bluestore-tool --command fsck --deep 1') immediately after
the failure has been discovered?
Does it succeed?
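
On a cephadm deployment, something like this should work (a sketch -
assuming osd.123 and the default data path; the daemon must be stopped
while fsck runs):

    # stop the OSD so bluestore is not in use
    ceph orch daemon stop osd.123

    # run the deep fsck from a container with the OSD's data dir mounted
    cephadm shell --name osd.123 -- ceph-bluestore-tool \
        --command fsck --deep 1 --path /var/lib/ceph/osd/ceph-123

    # bring the OSD back afterwards
    ceph orch daemon start osd.123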
Thanks,
Igor
On 9/14/2020 8:45 PM, Welby McRoberts wrote:
> Hi Igor
>
> We'll take a look at disabling swap on the nodes and see if that
> improves the situation.
>
> Having checked across all OSDs, we're not seeing
> bluestore_reads_with_retries at anything other than a zero value.
> We see anywhere from 3-10 occurrences of the error a week, but
> it's usually only one or two PGs that are inconsistent at any one
> time.
>
> Thanks
> Welby
>
> On Mon, Sep 14, 2020 at 12:17 PM Igor Fedotov <ifedotov@suse.de> wrote:
>
> Hi David,
>
> you might want to try disabling swap on your nodes. It looks
> like there is some implicit correlation between such read errors
> and enabled swapping.
>
> Also, I'm wondering whether you can observe non-zero values for
> the "bluestore_reads_with_retries" performance counter across your
> OSDs. How widespread are these cases? How high does the counter
> get?
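>
> A quick way to sweep that counter over all OSDs (a sketch - assumes
> jq; if 'ceph tell' doesn't accept perf dump on your release, run
> 'ceph daemon osd.N perf dump' on the OSD host instead):
>
>     for id in $(ceph osd ls); do
>         echo -n "osd.$id: "
>         # non-zero means a read failed initially but succeeded on retry
>         ceph tell osd.$id perf dump | jq '.bluestore.bluestore_reads_with_retries'
>     done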
>
>
> Thanks,
>
> Igor
>
>
> On 9/9/2020 4:59 PM, David Orman wrote:
> > Right, you can see the previously referenced ticket/bug in
> the link I had
> > provided. It's definitely not an unknown situation.
> >
> > We have another one today:
> >
> > debug 2020-09-09T06:49:36.595+0000 7f570871d700 -1 bluestore(/var/lib/ceph/osd/ceph-123) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x60000, got 0x6706be76, expected 0x929a618, device location [0x2f387d70000~1000], logical extent 0xe0000~1000, object 0#2:7ff493bc:::rbd_data.3.20d195d612942.0000000004228a96:head#
> >
> > debug 2020-09-09T06:49:36.611+0000 7f570871d700 -1 bluestore(/var/lib/ceph/osd/ceph-123) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x60000, got 0x6706be76, expected 0x929a618, device location [0x2f387d70000~1000], logical extent 0xe0000~1000, object 0#2:7ff493bc:::rbd_data.3.20d195d612942.0000000004228a96:head#
> >
> > debug 2020-09-09T06:49:36.611+0000 7f570871d700 -1 bluestore(/var/lib/ceph/osd/ceph-123) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x60000, got 0x6706be76, expected 0x929a618, device location [0x2f387d70000~1000], logical extent 0xe0000~1000, object 0#2:7ff493bc:::rbd_data.3.20d195d612942.0000000004228a96:head#
> >
> > debug 2020-09-09T06:49:36.611+0000 7f570871d700 -1 bluestore(/var/lib/ceph/osd/ceph-123) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x60000, got 0x6706be76, expected 0x929a618, device location [0x2f387d70000~1000], logical extent 0xe0000~1000, object 0#2:7ff493bc:::rbd_data.3.20d195d612942.0000000004228a96:head#
> >
> > debug 2020-09-09T06:49:37.315+0000 7f570871d700 -1 log_channel(cluster) log [ERR] : 2.3fe shard 123(0) soid 2:7ff493bc:::rbd_data.3.20d195d612942.0000000004228a96:head : candidate had a read error
> >
> > debug 2020-09-09T06:57:08.930+0000 7f570871d700 -1 log_channel(cluster) log [ERR] : 2.3fes0 deep-scrub 0 missing, 1 inconsistent objects
> >
> > debug 2020-09-09T06:57:08.930+0000 7f570871d700 -1 log_channel(cluster) log [ERR] : 2.3fe deep-scrub 1 errors
> >
> > This happens across the entire cluster, not just one
> server, so we don't
> > think it's faulty hardware.
> >
> > On Wed, Sep 9, 2020 at 12:51 AM Janne Johansson <icepic.dz@gmail.com> wrote:
> >
> >> I googled "got 0x6706be76, expected" and found some hits regarding
> >> Ceph, so whatever it is, you are not the first, and that number has
> >> some internal meaning.
> >> A Red Hat solution for a similar issue says that this checksum is
> >> what you get when reading all zeroes, and hints at a bad write
> >> cache on the controller, or something else that ends up clearing
> >> data instead of writing the correct information on shutdowns.
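> >>
> >> If you want to test that theory, a check along these lines could
> >> confirm whether the failing block actually reads back as zeroes
> >> (a sketch - the offset is the device location from the quoted log,
> >> and the block path is an example for that OSD):
> >>
> >>     # read the 4 KiB block at device location [0x12b403c0000~1000]
> >>     # and dump it; all-zero output supports the cleared-data theory
> >>     dd if=/var/lib/ceph/osd/ceph-25/block bs=4096 \
> >>         skip=$((0x12b403c0000 / 4096)) count=1 2>/dev/null | od -A x -t x1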
> >>
> >>
> >> On Tue, Sep 8, 2020 at 11:21 PM David Orman <ormandj@corenode.com> wrote:
> >>
> >>>
> >>> We're seeing repeated inconsistent PG warnings, generally on the
> >>> order of 3-10 per week:
> >>>
> >>> pg 2.b9 is active+clean+inconsistent, acting [25,117,128,95,151,15]
> >>>
> >>> Every time we look at them, we see the same checksum (0x6706be76):
> >>>
> >>> debug 2020-08-13T18:39:01.731+0000 7fbc037a7700 -1 bluestore(/var/lib/ceph/osd/ceph-25) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x6706be76, expected 0x61f2021c, device location [0x12b403c0000~1000], logical extent 0x0~1000, object 2#2:0f1a338f:::rbd_data.3.20d195d612942.0000000001db869b:head#
> >>>
> >>> This looks a lot like: https://tracker.ceph.com/issues/22464
> >>> That said, we've got the following versions in play (cluster was
> >>> created with 15.2.3):
> >>> ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)
> >>>
> >>
> >> --
> >> May the most significant bit of your life be positive.
> >>