Hi cephers,
At today's CLT meeting we agreed to *make Ceph API tests "required" again*
for pull requests to be merged:
- The current approach (*"honoring the agreement not to merge failing
PRs"*) is simply not working: PRs have been merged with API tests in
red. Most of those failures are harmless random ones (*we are working
to reduce them*), but at other times the API tests warned about real
issues... which eventually slipped into the code. [1]
<https://tracker.ceph.com/issues/47306> [2]
<https://tracker.ceph.com/issues/45717> [3]
<https://github.com/ceph/ceph/pull/36091>
- The cost & risk of debugging issues a posteriori is usually higher
than the pain of retriggering the API tests (*we are working to improve
this*).
- Ceph API tests, even with their downsides, are providing true
integration testing at CI time: this doesn't simply mean complex unit tests
or component testing, it means running a vstart Ceph cluster and actually
testing RADOS, RBD, RGW, CephFS...
*What does this mean?*
If the Ceph API tests are green, great! That's not hard to achieve: *~75% of
PRs pass the Ceph API tests on the first attempt.*
What if they *are NOT* passing?
From GitHub you can open the Ceph API test results in Jenkins by clicking
*"Details"*, which takes you to a report:
1. The test may fail for several reasons: issues in a Jenkins node,
GitHub repo fetching, the "make" stage, ... In that case you can simply
retrigger the Ceph API tests by adding a comment to the PR with the
text "jenkins test api".
2. If the failure actually comes from the Ceph API tests themselves,
the report will look like this one
<https://jenkins.ceph.com/job/ceph-api/2726/>.
From there:
- You can quickly check whether this has already been reported
<https://tracker.ceph.com/search?q=FAIL:%20test_all%20%28tasks.mgr.dashboard.test_rgw.RgwBucketTest%29&open_issues=1>
(a known issue or a flapping test) or otherwise raise a new issue report
<https://tracker.ceph.com/projects/mgr/issues/new?issue[subject]=FAIL:%20test_all%20%28tasks.mgr.dashboard.test_rgw.RgwBucketTest%29&issue[category_id]=182&issue[assigned_to_id]=&issue[description]=%3Cpre%3EPlease%20copy%20here%20the%20console%20error%20message%3C/pre%3E>
.
- If the failure looks like a flapping one, you may retrigger the tests.
- If, however, the failure is caused by an intentional change in
behaviour, please reach out to the Dashboard team for help.
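For those who prefer scripting these steps, here is a minimal Python sketch of the two mechanical ones: building the tracker search link for a failure title and preparing the retrigger comment. It is only an illustration, not something the Dashboard team ships. The helper names (`tracker_search_url`, `retrigger_comment_request`) are hypothetical, and posting the comment through GitHub's REST "create an issue comment" endpoint instead of the web UI is an assumption on my side; the call would also need an authentication token, which is left out here. Nothing is actually sent.

```python
import json
from urllib.parse import quote, urlencode

TRACKER_SEARCH = "https://tracker.ceph.com/search"
RETRIGGER_PHRASE = "jenkins test api"  # comment text quoted in this mail


def tracker_search_url(failure_title: str) -> str:
    """Build a tracker search URL restricted to open issues.

    Hypothetical helper: it only mirrors the shape of the search link
    used in this mail (q=<failure title>&open_issues=1).
    """
    query = urlencode(
        {"q": failure_title, "open_issues": 1},
        safe=":",          # keep "FAIL:" readable, as in the original link
        quote_via=quote,   # encode spaces as %20 rather than "+"
    )
    return f"{TRACKER_SEARCH}?{query}"


def retrigger_comment_request(pr_number: int, repo: str = "ceph/ceph") -> dict:
    """Describe (without sending) the request that would post the comment.

    Assumption: the comment is posted via GitHub's REST API; typing the
    same text in the PR web page has the same effect.
    """
    return {
        "method": "POST",
        "url": f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments",
        "body": json.dumps({"body": RETRIGGER_PHRASE}),
    }


if __name__ == "__main__":
    title = "FAIL: test_all (tasks.mgr.dashboard.test_rgw.RgwBucketTest)"
    print(tracker_search_url(title))
    print(retrigger_comment_request(36091))
```

Check the tracker link first and only retrigger if the failure looks like a known flapping test; otherwise please file a new issue as described above.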
*What can you expect from the Dashboard team?*
- We are working to harden Ceph API tests, increase their coverage and
make them more stable. You may check our backlog
<https://pad.ceph.com/p/dashboard-api-test-improvements> of
improvements. You are welcome to contribute ideas or, even better,
working code ;-)
- We monitor every day how the Ceph API tests are doing: failure
rate, runtime, ...
- You can find us on IRC (#ceph-dashboard), on GitHub (@ceph/dashboard),
on this very mailing list, or by pinging us directly: Lenz (in CC) is the
component lead, Laura (also in CC) takes care of Dashboard QA, or myself.
Kind regards,
Ernesto