On Wed, Apr 29, 2020 at 15:59, Willem Jan Withagen <wjw@digiware.nl> wrote:
On 29-4-2020 03:46, kefu chai wrote:
> On Fri, Apr 24, 2020 at 6:17 PM Willem Jan Withagen <wjw@digiware.nl> wrote:
>> On 20-4-2020 18:07, kefu chai wrote:
>>
>>
>> On Mon, Apr 20, 2020 at 19:53, Willem Jan Withagen <wjw@digiware.nl> wrote:
>>> On 20-4-2020 13:26, kefu chai wrote:
>>>> On Sun, Apr 19, 2020 at 7:00 PM Willem Jan Withagen <wjw@digiware.nl> wrote:
>>>>> Hi Kefu,
>>>>>
>>>>> This looks like a difference caused by a field that is not initialised correctly.
>>>>> Am I correct in assuming that?
>>>>>
>>>>> Or suggestions to debug this?
>>>> i think you already found the PR addressing this issue and filed
>>>> https://tracker.ceph.com/issues/45130?
>>>>
>>>> anything i am missing?
>>> That PR was about the check-generated script not being able to set
>>> the return result in case of failure, because Bash creates a subshell
>>> for the while-loop and thus puts the counting variables in a different
>>> scope. You fixed that in this PR.
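
The subshell behaviour described above can be reproduced in a few lines of shell. This is a minimal sketch, not the actual check-generated.sh code and not necessarily how the PR fixed it; the variable names and the sample input are hypothetical.

```shell
# Hypothetical sample data standing in for per-test results.
results='ok
bad
ok
bad'

# Piped variant: Bash runs the while-loop in a subshell, so the
# increments to failed_piped are lost once the loop finishes.
failed_piped=0
printf '%s\n' "$results" | while read -r result; do
    if [ "$result" = bad ]; then failed_piped=$((failed_piped + 1)); fi
done

# Redirected variant: the loop reads from a here-document and runs in
# the current shell, so the counter survives.
failed_redirected=0
while read -r result; do
    if [ "$result" = bad ]; then failed_redirected=$((failed_redirected + 1)); fi
done <<EOF
$results
EOF

echo "piped=$failed_piped redirected=$failed_redirected"
```

With the piped loop the counter is still 0 after the loop, so a script that derives its exit status from it would always report success.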
>>>
>>> Once that was fixed, I started getting errors reported when running the script for testing
>>>       RGWObjManifest
>>> and later on
>>>       bluestore_bdev_label_t
>>>
>>> So for these 2 cases `dump_json` and `encode decode dump_json` give
>>> different results.
>>> I very much suspect that there is a difference between initializing
>>> an object and decoding an object in the way some fields are handled.
>>>
>>> But I haven't found that (yet).
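
The comparison that fails here can be sketched as a small shell helper. It uses the same ceph-dencoder invocations that appear in the test log further down; the function name `roundtrip_matches` and the `DENCODER` variable are hypothetical and are not taken from check-generated.sh.

```shell
# Assumption: DENCODER names the ceph-dencoder binary seen in the test log.
DENCODER=${DENCODER:-ceph-dencoder}

# Returns 0 when dumping a generated test instance directly and dumping it
# after an encode/decode round trip give identical JSON, 1 when they differ.
roundtrip_matches() {
    type_name=$1
    instance=$2
    direct=$(mktemp) || return 2
    decoded=$(mktemp) || return 2
    "$DENCODER" type "$type_name" select_test "$instance" dump_json > "$direct"
    "$DENCODER" type "$type_name" select_test "$instance" encode decode dump_json > "$decoded"
    if cmp -s "$direct" "$decoded"; then
        status=0
    else
        # Show which fields differ between the two dumps.
        diff "$direct" "$decoded" || true
        status=1
    fi
    rm -f "$direct" "$decoded"
    return "$status"
}
```

It could be called as `roundtrip_matches RGWObjManifest 1 || echo "dump_json check failed"`. A mismatch means a field is either not encoded, not decoded, or set differently when the test instance is generated.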
>>
>> I see. Willem, can you see the same issue on master or octopus?
>>
>>
>> Hi Kefu,
>>
>> So fixing the bluestore_bdev_label_t error only requires backporting #29968.
>> The error with RGWObjManifest is fixed in #29862, but that requires quite a few
>> more backports for all fields of RGWObjManifest and its children to actually get it fixed.
>>
>> So I submitted a tracker to backport #29968.
>> Getting #29862 to apply on Nautilus will need quite some fixing, and thus requires
>> a specific patch for Nautilus. And even then it will require quite a few more backports.
>>
>> So for "fixing" RGWObjManifest, I'm currently running my FreeBSD tests with a patch
>> that fixes the testing loop as in #29862, but then excludes this test on Nautilus.
>>
>> If that is acceptable for a patch on Nautilus, I'll submit that.
> hi Willem, thanks for the investigations. nautilus is not EOL, so a
> patch is always acceptable i think. but "quite some fixing" and "quite
> some backports" are kind of worrying me. what do you mean by "quite
> some"? do they involve tremendous work preparing a fix only to
> address the test failure, or are they indeed bug fixes that
> address issues we could be facing in production?

To start with the last point: No, I do not expect that there is any
impact on production.
So we could also try to ignore just these tests.

#29968 is a no-brainer to backport.
#29862 requires extra fixes ("quite some" might be overstated).
Since a patch might be acceptable, I'll put some effort into it.

Can I create ONE PR that holds several cherry-picks, and some custom
commits?

Sure. I think it’s fine. Presumably they are not touching lots of different components in Ceph.


Otherwise I'll just use #29862 as basis to create a new PR.

--WjW

>> --WjW
>>
>>>
>>> --WjW
>>>
>>>>> Thanx,
>>>>> --WjW
>>>>>
>>>>> Start 94: check-generated.sh
>>>>> 1/2 Test #94: check-generated.sh ...............***Failed 58.52 sec
>>>>> Enivronment Variables Already Set
>>>>> checking ceph-dencoder generated test instances...
>>>>> numgen type
>>>>> ................................
>>>>> 2 RGWOLHInfo
>>>>> 2 RGWObjManifest
>>>>> /tmp/typ-biXq5mHqd /tmp/typ-m55peHsmj differ: char 8124, line 278
>>>>> **** RGWObjManifest test 1 dump_json check failed ****
>>>>> ceph-dencoder type RGWObjManifest select_test 1 dump_json > /tmp/typ-biXq5mHqd
>>>>> ceph-dencoder type RGWObjManifest select_test 1 encode decode dump_json > /tmp/typ-m55peHsmj
>>>>> 278c278
>>>>> < "name": "",
>>>>> ---
>>>>> >                        "name": "0",
>>>>> 280c280
>>>>> < "ns": ""
>>>>> ---
>>>>> >                        "ns": "shadow"
>>>>> 294c294
>>>>> < "ofs": 0,
>>>>> ---
>>>>> >            "ofs": 5242880,
>>>>> 314c314
>>>>> < "name": "",
>>>>> ---
>>>>> >                        "name": "0",
>>>>> 316c316
>>>>> < "ns": ""
>>>>> ---
>>>>> >                        "ns": "shadow"
>>>>> Start 112: unittest_journal
>>>>> 2/2 Test #112: unittest_journal ................. Passed 5.47 sec
>>>>>
>>>>> 50% tests passed, 1 tests failed out of 2
>>>>>
>>>>> Total Test time (real) = 64.04 sec
>>>>>
>>>>> The following tests FAILED:
>>>>> 94 - check-generated.sh (Failed)
>>>>> Errors while running CTest
>>>>> Build step 'Execute shell' marked build as failure
>>>>
>> --
>> Regards
>> Kefu Chai
>>
>>
>

--
Regards
Kefu Chai