Thank you folks for the feedback. Clearly this is a controversial matter.
So be prepared, long reply ahead :'D
@Ilya Dryomov <idryomov(a)gmail.com>, thanks for the naming hints. I'd
definitely go for "backport:no-conflicts | has-conflicts".
@Ilya, @Loïc, regarding the size labels, I totally agree: talking here
about "complexity" was too ambitious. "Lines of Code" is a pretty
simplistic metric for describing whether something might be risky or not
(one-liners can be fatal too <https://gist.github.com/aras-p/6224951>),
but OTOH, from a reviewer's perspective, it's a pretty decent a-priori
indicator of the time/effort a PR review might require (especially when
there were conflicts). Having these kinds of a-priori hints may help us
better decide where to allocate our efforts.
As a test: a backporter/reviewer/QA engineer needs to approach these 2 PRs,
labeled this way:
[image: image.png]
[image: image.png]
And now they get the following hints:
[image: image.png]
[image: image.png]
We might debate whether these flags could bias reviewers for or against
those PRs, but that wouldn't be much different from seeing this and
clicking "merge PR":
[image: image.png]
Regarding the concern about the number of labels, raised by Loïc and
Yuri: I checked the last 25 PRs and the median is 3 labels/PR (3.4 on
average), and that includes a cephadm batch PR with 14 labels! :D
[image: image.png]
Maybe it's just me, but I don't think that 3 labels are too many, as long
as they're useful. That said, there are things we could improve here:
- Remove less informative labels (e.g.: "pybind" appears in 40% of the
PRs and there's no such team or component).
- Check that labels are really useful. E.g.: is the "needs-review" label
improving reviews on stale PRs? Is "needs-rebase" really required?
- Better use of colors or grouping tags. This is an example from
kubernetes <https://github.com/kubernetes/kubernetes/pull/98152>:
[image: image.png]
@Nathan Cutler <ncutler(a)suse.com>, if we fear that automation might
increase the number of regressions because it makes backporting easier,
we should also stop using the "ceph-backports" and "backport-create-issue"
scripts, shouldn't we? Or, on the contrary, we might also add automation to
block backports from non-bug trackers, which we are not doing now
(BTW ML-based PR classification is nothing new and there are lots of
interesting proposals [1]
<https://swat.polymtl.ca/~foutsekh/docs/CASCON'08.pdf> [2]
<https://people.csail.mit.edu/hunkim/images/3/37/Papers_kim_2007_bugcache.pdf>
).
I'm skeptical of the value of labels, but I do think it would be useful
to have Jenkins jobs checking:
1. whether the commits being cherry-picked are really in master
2. whether the master commits cherry-picked cleanly
3. whether the backport PR contains the same number of commits as the
master PR
I agree, although I think that's more of an implementation detail. I also
agree with you here: we should have a "commit sanity" check (e.g.: block
merge commits, detect the "Fixes" line, etc.), but in the meantime we may
start with one and then switch to the other.
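To make this more concrete, here's a minimal sketch (in Python; the label
names follow Ilya's suggestion, everything else is hypothetical, not an
actual Ceph script) of how such a check could classify a backport PR from
its commit messages, assuming the usual "(cherry picked from commit ...)"
trailer and the "Conflicts:" section that git leaves in the message:

```python
import re

# Trailer added by `git cherry-pick -x`.
CHERRY_PICK_RE = re.compile(r"\(cherry picked from commit ([0-9a-f]{7,40})\)")
# "Conflicts:" section left uncommented in the commit message.
CONFLICTS_RE = re.compile(r"^Conflicts:", re.MULTILINE)


def classify_backport(commit_messages, master_shas):
    """Suggest a label for a backport PR.

    commit_messages: full message of each commit in the backport PR.
    master_shas: SHAs known to be in master (e.g. from `git rev-list`).

    Returns "backport:no-conflicts", "backport:has-conflicts", or
    "needs-human" when a commit lacks a cherry-pick trailer or points
    at a SHA that isn't in master.
    """
    label = "backport:no-conflicts"
    for msg in commit_messages:
        m = CHERRY_PICK_RE.search(msg)
        if not m or not any(s.startswith(m.group(1)) for s in master_shas):
            return "needs-human"  # not a traceable pick of a master commit
        if CONFLICTS_RE.search(msg):
            label = "backport:has-conflicts"
    return label
```

In practice the job would fetch the master SHAs from git and could also
compare the commit count against the original master PR, but the
classification idea is the same.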
@Yuri Weinstein <yweinste(a)redhat.com> interesting feedback. The current
labels were mostly meant to dispatch PRs to the proper component bucket,
but we may give it another thought for mapping them to QA suites. I
remember Neha & Josh mentioning this in relation to the effort to bring
teuthology runs to PR checks.
As the number of topics seems to be growing, shall we schedule a quick
meeting and review this?
Thanks and nice weekend!
Kind Regards,
Ernesto
On Thu, Apr 22, 2021 at 11:52 PM Josh Durgin <jdurgin(a)redhat.com> wrote:
On 4/22/21 1:22 PM, Ilya Dryomov wrote:
On Thu, Apr 22, 2021 at 8:16 PM Nathan Cutler
<ncutler(a)suse.com> wrote:
>
> Hi Ernesto:
>
> I fully concur with what Loic wrote, and just to add to that:
>
> Years ago, a lead developer gave us some general principles that all
backports
> should ideally follow. These are codified:
>
>
https://github.com/ceph/ceph/blob/master/SubmittingPatches-backports.rst#ge…
>
> but I think it's worth quoting them here. Each backport is supposed to
specify:
>
> 1. what bug it is fixing
> 2. why this fix is the minimal way to do it
> 3. why does this need to be fixed in <release>
>
> Now, how good we are, as a project, at adhering to these principles is
> already pretty questionable. How will introducing more automation help
> us improve? Or maybe we should change the principles to say: "the Ceph
> project encourages commits to be backported from master indiscriminately
> without any justification or risk analysis"?
>
> I guess we would not change the stated principles as suggested, but I
> still think that, when deciding what kind of automation to introduce,
> we should ask ourselves questions like:
>
> How stable are our "stable" releases?
> Do we value stability over features, or vice versa?
> How often does the drive to backport stuff introduce regressions?
> How to gauge the riskiness of a given backport?
> Do the answers to these questions vary from one component to another,
> or can answers be formulated on a project-wide basis?
>
> Backporting stuff is necessary, but also risky. Automation, I think, can
> actually increase the frequency with which we unintentionally introduce
> regressions into stable releases because
>
> * automation, if successful, might tend to increase the overall number
>   of backports
> * automation cannot provide any justification or estimate of risk, so
>   it might also increase the number of backports that lack any
>   justification or estimation of risk
>
> I'm skeptical of the value of labels, but I do think it would be useful
> to have Jenkins jobs checking:
>
> 1. whether the commits being cherry-picked are really in master
> 2. whether the master commits cherry-picked cleanly
> 3. whether the backport PR contains the same number of commits as the
>    master PR
>
> These couldn't be "mandatory" checks because there are plenty of
> exceptions, but I think having this information there would be useful
> for reviewers (but I don't review backports so I don't know for sure).
Your 1. and 2. and 3. is exactly what I meant by "fully clean
backports" in my earlier reply. Whether it's a GitHub Action or a
Jenkins job, and whether the output is a set of labels or a comment in
the PR, this information is very useful to someone who is reviewing
backport PRs (especially in batches) because it allows them to
concentrate on the actual changes and not worry about checking the
boilerplate.
Even just "backport:no-conflicts" and "backport:has-conflicts" labels
or comments would be a great help because sometimes people just forget
to uncomment "Conflicts" in the commit message and something that
conflicted always deserves a second look even if the resolution was
trivial. There are cases when a clean cherry-pick needs adjustments,
but those are rare and hopefully the check can catch those (e.g. if
the backport modifies a file that wasn't modified in the original or
if the final diffstat is too different from the original).
Otherwise I share your and Loic's concerns about further automating the
backports themselves. Many projects use bots for this, which results in
very aggressive and often careless backporting because all it takes is
a comment addressed to a bot or a set label. I agree that we don't want
to go there.
Well said, I agree with the concerns about keeping humans involved to
understand the risk and necessity of a backport. This is an area
we should be spending human effort on when considering and reviewing
backports.
I'm all for labor-saving automation, and I think Loic raises an
important question - where are we spending lots of time, that can
be automated (and as Nathan mentions, without introducing additional risk)?
Checking the boilerplate like Nathan described is a great example - this
does take significant time in my experience, and can be easily
overlooked when many commits are involved. Automation (in the form of
labels or jenkins, I have no opinion on the mechanism) does seem
worthwhile there.
I also like Ernesto's idea of running the backport scripts more
frequently (e.g. with a bot or github action) to keep the tracker
in sync with the current state of PRs. It's only a minor inconvenience to
have outdated tracker state, but if the automation is already there, why
not use it?
Josh