Hi Xiubo,
Is there any way we could get a development build of the PR to upgrade
to, so we can test whether the lock order bug per Bug #62123 could be
the answer? Although I'm not sure that bug has been fixed yet?
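If it helps, here is roughly what I would expect to run on our end,
assuming the fix gets published as a test container image (for example
on quay.io/ceph-ci) and assuming a cephadm-managed deployment - both of
which are assumptions on my part, so please correct me:

# <build-tag> is a placeholder for whatever tag you point me at
ceph orch upgrade start --image quay.io/ceph-ci/ceph:<build-tag>
ceph orch upgrade status
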
-erich
On 4/21/24 9:39 PM, Xiubo Li wrote:
> Hi Erich,
>
> I raised one tracker for this: https://tracker.ceph.com/issues/65607.
>
> Currently I haven't figured out what was holding the 'dn->lock' in the
> 'lookup' request or elsewhere, since there are no debug logs.
>
> Hopefully we can get the debug logs, with which we can push this further.
>
> Thanks
>
> - Xiubo
>
> On 4/19/24 23:55, Erich Weiler wrote:
>> Hi Xiubo,
>>
>> Never mind, I was wrong: most of the blocked ops were 12 hours old. Ugh.
>>
>> I restarted the MDS daemon to clear them.
>>
>> I just reset to having one active MDS instead of two; let's see if
>> that makes a difference.
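>>
>> (For reference, that change was just the following, with 'cephfs' as a
>> placeholder for the real filesystem name:)
>>
>> ceph fs set cephfs max_mds 1
>> ceph fs status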
>>
>> I am beginning to think it may be impossible to catch the logs that
>> matter here. I feel like sometimes the blocked ops are just waiting
>> because of load, and sometimes they are waiting because they are stuck.
>> It's really hard to tell which without waiting a while. But I can't
>> wait with debug turned on, because my root disks (which are 150 GB)
>> fill up with debug logs in 20 minutes. So it almost seems that unless
>> I could somehow store many TB of debug logs we won't be able to catch
>> this.
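>>
>> One idea I may try - assuming our MDS hosts are plain package installs
>> using the default log path, and that the systemd unit id matches the
>> hostname, both of which I'd need to double-check - is to move the log
>> directory onto a bigger data volume before turning debug up:
>>
>> systemctl stop ceph-mds@$(hostname -s)
>> mv /var/log/ceph /bigdata/ceph-logs    # /bigdata is a placeholder mount
>> ln -s /bigdata/ceph-logs /var/log/ceph
>> systemctl start ceph-mds@$(hostname -s)
>>
>> But even that only buys so much room.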
>>
>> Let's see how having one MDS helps. Or maybe I actually need like 4
>> MDSs because the load is too high for only one or two. I don't know.
>> Or maybe it's the lock issue you've been working on. I guess I can
>> test the lock order fix once it's available.
>>
>> -erich
>>
>> On 4/19/24 7:26 AM, Erich Weiler wrote:
>>> So I woke up this morning and checked the blocked_ops again; there
>>> were 150 of them. But the age of each ranged from 500 to 4300
>>> seconds. So it seems as if they are eventually being processed.
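>>>
>>> For reference, I'm pulling these from the MDS admin socket on each MDS
>>> host ('mds.<name>' below is a placeholder, and I'm assuming the
>>> blocked-ops dump has the same per-op 'age' field as the regular ops
>>> dump):
>>>
>>> ceph daemon mds.<name> dump_blocked_ops | jq '.ops | length'
>>> ceph daemon mds.<name> dump_blocked_ops | jq '[.ops[].age] | max'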
>>>
>>> I wonder if we are thinking about this in the wrong way? Maybe I
>>> should be *adding* MDS daemons because my current ones are overloaded?
>>>
>>> Can a single server hold multiple MDS daemons? Right now I have
>>> three physical servers each with one MDS daemon on it.
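>>>
>>> I'm guessing - and this is only a guess, assuming a cephadm-managed
>>> cluster, with the hostnames and counts below as placeholders - that it
>>> would be a placement spec along these lines:
>>>
>>> cat > mds-spec.yaml <<EOF
>>> service_type: mds
>>> service_id: cephfs
>>> placement:
>>>   hosts:
>>>     - mds-host-1
>>>     - mds-host-2
>>>     - mds-host-3
>>>   count_per_host: 2
>>> EOF
>>> ceph orch apply -i mds-spec.yaml
>>> ceph fs set cephfs max_mds 4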
>>>
>>> I can still try reducing to one. And I'll keep an eye on blocked ops
>>> to see if any get to a very old age (and are thus wedged).
>>>
>>> -erich
>>>
>>> On 4/18/24 8:55 PM, Xiubo Li wrote:
>>>> Okay, please try setting only one active MDS.
>>>>
>>>>
>>>> On 4/19/24 11:54, Erich Weiler wrote:
>>>>> We have 2 active MDS daemons and one standby.
>>>>>
>>>>> On 4/18/24 8:52 PM, Xiubo Li wrote:
>>>>>> BTW, how many active MDS daemons are you using?
>>>>>>
>>>>>>
>>>>>> On 4/19/24 10:55, Erich Weiler wrote:
>>>>>>> OK, I'm sure I caught it in the right order this time; the logs
>>>>>>> should definitely show when the blocked/slow requests start.
>>>>>>> Check out these logs and dumps:
>>>>>>>
>>>>>>> http://hgwdev.gi.ucsc.edu/~weiler/
>>>>>>>
>>>>>>> It's a 762 MB tarball but it uncompresses to 16 GB.
>>>>>>>
>>>>>>> -erich
>>>>>>>
>>>>>>>
>>>>>>> On 4/18/24 6:57 PM, Xiubo Li wrote:
>>>>>>>> Okay, could you try this with 18.2.0?
>>>>>>>>
>>>>>>>> I suspect it was introduced by:
>>>>>>>>
>>>>>>>> commit e610179a6a59c463eb3d85e87152ed3268c808ff
>>>>>>>> Author: Patrick Donnelly <pdonnell(a)redhat.com>
>>>>>>>> Date:   Mon Jul 17 16:10:59 2023 -0400
>>>>>>>>
>>>>>>>>     mds: drop locks and retry when lock set changes
>>>>>>>>
>>>>>>>>     An optimization was added to avoid an unnecessary gather on the inode
>>>>>>>>     filelock when the client can safely get the file size without also
>>>>>>>>     getting issued the requested caps. However, if a retry of getattr
>>>>>>>>     is necessary, this conditional inclusion of the inode filelock
>>>>>>>>     can cause lock-order violations resulting in deadlock.
>>>>>>>>
>>>>>>>>     So, if we've already acquired some of the inode's locks
>>>>>>>>     then we must drop locks and retry.
>>>>>>>>
>>>>>>>>     Fixes: https://tracker.ceph.com/issues/62052
>>>>>>>>     Fixes: c822b3e2573578c288d170d1031672b74e02dced
>>>>>>>>     Signed-off-by: Patrick Donnelly <pdonnell(a)redhat.com>
>>>>>>>>     (cherry picked from commit b5719ac32fe6431131842d62ffaf7101c03e9bac)
>>>>>>>>
>>>>>>>>
>>>>>>>> On 4/19/24 09:54, Erich Weiler wrote:
>>>>>>>>> I'm on 18.2.1. I think I may have gotten the timing off on the
>>>>>>>>> logs and dumps, so I'll try again. It's just really hard to capture
>>>>>>>>> because I need to kind of be looking at it in real time to catch it.
>>>>>>>>> Hang on, lemme see if I can get another capture...
>>>>>>>>>
>>>>>>>>> -erich
>>>>>>>>>
>>>>>>>>> On 4/18/24 6:35 PM, Xiubo Li wrote:
>>>>>>>>>>
>>>>>>>>>> BTW, which Ceph version are you using?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 4/12/24 04:22, Erich Weiler wrote:
>>>>>>>>>>> BTW - it just happened again. I upped the debugging settings
>>>>>>>>>>> as you instructed and got more dumps (then returned the debug
>>>>>>>>>>> settings to normal).
>>>>>>>>>>>
>>>>>>>>>>> Attached are the new dumps.
>>>>>>>>>>>
>>>>>>>>>>> Thanks again,
>>>>>>>>>>> erich
>>>>>>>>>>>
>>>>>>>>>>> On 4/9/24 9:00 PM, Xiubo Li wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 4/10/24 11:48, Erich Weiler wrote:
>>>>>>>>>>>>>>> Does that mean it could be the lock order bug
>>>>>>>>>>>>>>> (https://tracker.ceph.com/issues/62123) as Xiubo suggested?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have raised one PR to fix the lock order issue; if possible,
>>>>>>>>>>>>>> please give it a try and see whether it resolves this issue.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you! Yeah, this issue is happening every couple days
>>>>>>>>>>>>> now. It just happened again today and I got more MDS dumps.
>>>>>>>>>>>>> If it would help, let me know and I can send them!
>>>>>>>>>>>>>
>>>>>>>>>>>> Once this happens, it would be better if you could enable the MDS
>>>>>>>>>>>> debug logs:
>>>>>>>>>>>>
>>>>>>>>>>>> debug mds = 20
>>>>>>>>>>>>
>>>>>>>>>>>> debug ms = 1
>>>>>>>>>>>>
>>>>>>>>>>>> And then provide the debug logs together with the MDS dumps.
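>>>>>>>>>>>>
>>>>>>>>>>>> For example, one way to set these at runtime and revert afterwards:
>>>>>>>>>>>>
>>>>>>>>>>>> ceph config set mds debug_mds 20
>>>>>>>>>>>> ceph config set mds debug_ms 1
>>>>>>>>>>>>
>>>>>>>>>>>> # revert once you have captured the logs
>>>>>>>>>>>> ceph config rm mds debug_mds
>>>>>>>>>>>> ceph config rm mds debug_ms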
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> I assume if this fix is approved and backported it will
>>>>>>>>>>>>> then appear in like 18.2.3 or something?
>>>>>>>>>>>>>
>>>>>>>>>>>> Yeah, it will be backported after being well tested.
>>>>>>>>>>>>
>>>>>>>>>>>> - Xiubo
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>> erich
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>
>