Thank you folks for the feedback. Clearly this is a controversial matter.
So be prepared, long reply ahead :'D
@Ilya Dryomov <idryomov(a)gmail.com>, thanks for the naming hints. I'd
definitely go for "backport:no-conflicts | has-conflicts".
@Ilya, @Loïc, regarding the size labels, I totally agree: talking here
about "complexity" was too ambitious. "Lines of Code" is a pretty
simplistic metric for describing whether something might be risky or not
(one-liners can be fatal too <https://gist.github.com/aras-p/6224951>),
but OTOH, from a reviewer's perspective, it's a pretty decent a-priori
indicator of the time/effort a PR review might require (especially when
there were conflicts). Having these kinds of a-priori hints may help us
better decide where to allocate our efforts.
As a test: a backporter/reviewer/QA engineer needs to approach these 2 PRs,
labeled this way:
[image: image.png]
[image: image.png]
And now they get the following hints:
[image: image.png]
[image: image.png]
We might debate whether these flags could bias reviewers for or against
those PRs, but that wouldn't be much different from seeing this and
clicking "merge PR":
[image: image.png]
Regarding the concern about the number of labels, raised by Loïc and
Yuri: I checked the last 25 PRs and the median is 3 labels/PR (3.4 on
average), and that includes a cephadm batch PR with 14 labels! :D
[image: image.png]
Maybe it's just me, but I don't think that 3 labels are too many, as long
as they're useful. That said, there are things we could improve here:
- Remove less informative labels (e.g.: "pybind" appears in 40% of the
PRs and there's no such team or component).
- Check that labels are really useful. E.g.: is the "needs-review" label
improving reviews on stale PRs? Is "needs-rebase" really required?
- Better use of colors or grouping tags. This is an example from
kubernetes <https://github.com/kubernetes/kubernetes/pull/98152>:
[image: image.png]
@Nathan Cutler <ncutler(a)suse.com>, if we fear that automation might
increase the number of regressions because it makes backporting easier,
we should also stop using the "ceph-backports" and "backport-create-issue"
scripts, shouldn't we? Or, on the contrary, we might also add automation to
block backports from non-bug trackers, which we are not doing now
(BTW ML-based PR classification is nothing new and there are lots of
interesting proposals [1]
<https://swat.polymtl.ca/~foutsekh/docs/CASCON'08.pdf> [2]
<https://people.csail.mit.edu/hunkim/images/3/37/Papers_kim_2007_bugcache.pdf>
).
I'm skeptical of the value of labels, but I do think it would be useful
to have Jenkins jobs checking:
1. whether the commits being cherry-picked are really in master
2. whether the master commits cherry-picked cleanly
3. whether the backport PR contains the same number of commits as the
master PR
I agree, although I think that's more of an implementation detail. I also
agree with you here: we should have a "commit sanity" check (e.g.: block
merge commits, detect the "Fixes" line, etc.), but in the meantime we may
start with one and then switch to the other.
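To make this more concrete, here's a minimal sketch (in Python; the label
names follow Ilya's suggestion, everything else is hypothetical, not an
actual Ceph script) of how such a check could classify a backport PR from
its commit messages, assuming the usual "(cherry picked from commit ...)"
trailer and the "Conflicts:" section that git leaves in the message:

```python
import re

# Trailer added by `git cherry-pick -x`.
CHERRY_PICK_RE = re.compile(r"\(cherry picked from commit ([0-9a-f]{7,40})\)")
# "Conflicts:" section left uncommented in the commit message.
CONFLICTS_RE = re.compile(r"^Conflicts:", re.MULTILINE)


def classify_backport(commit_messages, master_shas):
    """Suggest a label for a backport PR.

    commit_messages: full message of each commit in the backport PR.
    master_shas: SHAs known to be in master (e.g. from `git rev-list`).

    Returns "backport:no-conflicts", "backport:has-conflicts", or
    "needs-human" when a commit lacks a cherry-pick trailer or points
    at a SHA that isn't in master.
    """
    label = "backport:no-conflicts"
    for msg in commit_messages:
        m = CHERRY_PICK_RE.search(msg)
        if not m or not any(s.startswith(m.group(1)) for s in master_shas):
            return "needs-human"  # not a traceable pick of a master commit
        if CONFLICTS_RE.search(msg):
            label = "backport:has-conflicts"
    return label
```

In practice the job would fetch the master SHAs from git and could also
compare the commit count against the original master PR, but the
classification idea is the same.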
@Yuri Weinstein <yweinste(a)redhat.com> interesting feedback. The current
labels were mostly meant to dispatch PRs to the proper component bucket,
but we may give it another thought for mapping them to QA suites. I
remember Neha & Josh mentioning this in relation to the effort to bring
teuthology runs to PR checks.
As the number of topics seems to be growing, shall we schedule a quick
meeting and review this?
Thanks and nice weekend!
Kind Regards,
Ernesto
On Thu, Apr 22, 2021 at 11:52 PM Josh Durgin <jdurgin(a)redhat.com> wrote:
On 4/22/21 1:22 PM, Ilya Dryomov wrote:
On Thu, Apr 22, 2021 at 8:16 PM Nathan Cutler
<ncutler(a)suse.com> wrote:
>
> Hi Ernesto:
>
> I fully concur with what Loic wrote, and just to add to that:
>
> Years ago, a lead developer gave us some general principles that all
backports
> should ideally follow. These are codified:
>
>
https://github.com/ceph/ceph/blob/master/SubmittingPatches-backports.rst#ge…
>
> but I think it's worth quoting them here. Each backport is supposed to
specify:
>
> 1. what bug it is fixing
> 2. why this fix is the minimal way to do it
> 3. why does this need to be fixed in <release>
>
> Now, how good we are, as a project, at adhering to these principles is
> already pretty questionable. How will introducing more automation help
> us improve? Or maybe we should change the principles to say: "the Ceph
> project encourages commits to be backported from master indiscriminately
> without any justification or risk analysis"?
>
> I guess we would not change the stated principles as suggested, but I
> still think that, when deciding what kind of automation to introduce,
> we should ask ourselves questions like:
>
> How stable are our "stable" releases?
> Do we value stability over features, or vice versa?
> How often does the drive to backport stuff introduce regressions?
> How to gauge the riskiness of a given backport?
> Do the answers to these questions vary from one component to another,
> or can answers be formulated on a project-wide basis?
>
> Backporting stuff is necessary, but also risky. Automation, I think, can
> actually increase the frequency with which we unintentionally introduce
> regressions into stable releases because
>
> * automation, if successful, might tend to increase the overall number
>   of backports
> * automation cannot provide any justification or estimate of risk, so
>   it might also increase the number of backports that lack any
>   justification or estimation of risk
>
> I'm skeptical of the value of labels, but I do think it would be useful
> to have Jenkins jobs checking:
>
> 1. whether the commits being cherry-picked are really in master
> 2. whether the master commits cherry-picked cleanly
> 3. whether the backport PR contains the same number of commits as the
>    master PR
>
> These couldn't be "mandatory" checks because there are plenty of
> exceptions, but I think having this information there would be useful
> for reviewers (but I don't review backports so I don't know for sure).
Your 1. and 2. and 3. is exactly what I meant by "fully clean
backports" in my earlier reply. Whether it's a GitHub Action or a
Jenkins job, and whether the output is a set of labels or a comment in
the PR, this information is very useful to someone who is reviewing
backport PRs (especially in batches) because it allows them to
concentrate on the actual changes and not worry about checking the
boilerplate.
Even just "backport:no-conflicts" and "backport:has-conflicts" labels
or comments would be a great help because sometimes people just forget
to uncomment "Conflicts" in the commit message and something that
conflicted always deserves a second look even if the resolution was
trivial. There are cases when a clean cherry-pick needs adjustments,
but those are rare and hopefully the check can catch those (e.g. if
the backport modifies a file that wasn't modified in the original or
if the final diffstat is too different from the original).
Otherwise I share your and Loic's concerns about further automating the
backports themselves. Many projects use bots for this, which results in
very aggressive and often careless backporting because all it takes is
a comment addressed to a bot or a set label. I agree that we don't want
to go there.
Well said, I agree with the concerns about keeping humans involved to
understand the risk and necessity of a backport. This is an area
we should be spending human effort on when considering and reviewing
backports.
I'm all for labor-saving automation, and I think Loic raises an
important question - where are we spending lots of time, that can
be automated (and as Nathan mentions, without introducing additional risk)?
Checking the boilerplate like Nathan described is a great example - this
does take significant time in my experience, and can be easily
overlooked when many commits are involved. Automation (in the form of
labels or jenkins, I have no opinion on the mechanism) does seem
worthwhile there.
I also like Ernesto's idea of running the backport scripts more
frequently (e.g. with a bot or github action) to keep the tracker
in sync with the current state of PRs. It's only a minor inconvenience to
have outdated tracker state, but if the automation is already there, why
not use it?
Josh