Hi Greg,

Thanks a lot for helping identify potential shortcomings to this.

"It's not easy to identify when satisfactory testing has been completed for a given PR (since teuthology is not integrated, and a run can have failures that do or don't matter for a given PR)"

"Mergify [...] seems like it works best when [...] the necessary reviews and test suites are very well-integrated into the GitHub API, are the sole qualifier for merging, and can be counted on to be accurate. Unfortunately, neither of those apply to the Ceph project as a whole."

That sounds to me like there are a couple of issues over there. So:

- Can we trigger package builds and schedule teuthology runs from GitHub (e.g: when qa-needed label is added)?

- Can we identify QA suites to run based on labels, paths, components, comments ("Jenkins run teuthology-suite..."), ...? Besides, if we add CODEOWNERS to Ceph components, we can reuse these mappings to pick QA suites.

- Can we either stabilize flapping tests or, in the meantime, flag/ignore/remove them?

- Can we publish teuthology results back to GitHub and, if all positive, remove qa-needed label?

To me this sounds like we already have there 90% of what we need to automate the whole dev workflow, but that 10% missing is blocking us from moving forward on CI.

Even if we can only implement one or two of the above steps I think we'll have quite improved the issues you stated before. And the automated process may always fall back to a manual one if some condition is not met.

Kind Regards,

Ernesto

El mar., 13 ago. 2019 0:22, Gregory Farnum <gfarnum@redhat.com> escribió:

On Fri, Aug 9, 2019 at 9:41 AM Ernesto Puerta <epuertat@redhat.com> wrote:
Thanks, Sage. I'll check that script. It's interesting to know how this has been done so far.

Yuri, thanks a lot for the detailed description of steps. So let's rephrase those ones in a different way: what should block a PR from being merged?
The approval of a component lead/team members?
Then let's put a "reviewed-by=@ceph/component-lead" condition. We just started to enforce this in Dashboard by using Code-Owners.
Integration/E2E tests passing?
As part of the PR checklist, both PR author or reviewers SHOULD identify whether a PR needs to pass a QA run and add the "qa-needed" label accordingly. In that case let's make "qa-needed" label block PR merges (or rely on the exiting rule for "DNM" label). Or we may create the opposite label "qa-not-needed" and assume that by default all PRs need QA?
We may use this Github Action to ensure the PR checklist is followed (it'll add an "Incomplete Tasks" label if some checkbox is not ticked).
Issue/Bug logging/status updates? Unfortunately this is out of the scope of Mergify, but it could be 'easily' automated via Github Hooks/Actions or Jenkins. I feel that we are missing workflows in Github issues, and maybe labels are used to fill that gap?
In any case, IMHO we should avoid falling back to unspecific conditions/labels, like using a "DNM"/"Ready-to-be-merged" label, as doing that would end up being more or less the same as clicking the Merge button.

I'm not sure how the dashboard development process works, but Yuri's questions and the issues they provoke worry me about any kind of automated merge system in the broader Ceph codebase. It's not easy to identify when satisfactory testing has been completed for a given PR (since teuthology is not integrated, and a run can have failures that do or don't matter for a given PR), and reviewed-by statements or Github approvals can be provided either before or after that teuthology testing is done, and sometimes contingent on it passing. (ie, "LGTM assuming tests pass")

I really don't see a way we can programmatically interrogate the necessary "win" conditions right now without requiring some kind of explicit label that is semantically equivalent to pushing the merge button. Plus, our leads already have workflows built up to automate some portion of those steps when they're comfortable.

From looking at how Mergify sells itself and is used in ceph-ansible, it seems like it works best when either
1) the merge conditions are very loose (ie, 1-2 approvals), or
2) the necessary reviews and test suites are very well-integrated into the GitHub API, are the sole qualifier for merging, and can be counted on to be accurate.

Unfortunately, neither of those apply to the Ceph project as a whole. :(
-Greg