Hey All,
This Sunday (21/2) I plan to upgrade both jenkins.ceph.com and
2.jenkins.ceph.com to version 2.263.4.
The expected downtime is approximately one hour.
I will send another notice once it's done.
Thanks,
--
Adam Kraitman
Systems Administrator
Ceph Engineering
IRC: akraitma
Hello folks, this past summer Shraddha Agrawal implemented a new way for
teuthology to run tests - a single process, teuthology-dispatcher,
locking and then running jobs, rather than a bunch of workers competing
for locks [0].
Since there's a single dispatcher for each queue, jobs are run in strict
priority order. This also enables a couple of improvements to the test
experience:
1) jobs may require more nodes - since only one job is locking at a
time, they cannot be starved of available nodes
2) dead jobs will have full logs - jobs that hit the max_job_time (12
hours in sepia) will have full ceph logs and coredumps collected as
usual - this should help quite a bit with stabilizing pacific
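To illustrate why a single dispatcher avoids node starvation, here is a toy
sketch in Python. It is not the teuthology code: the Job and Dispatcher
classes, their methods, and the convention that a lower priority number runs
first are all assumptions made for illustration. The point it shows is that
the job at the head of the queue is the only one allowed to take nodes, so a
smaller, lower-priority job cannot grab them first.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Job:
    # Hypothetical job record; ordering uses (priority, seq) only.
    priority: int                      # assumption: lower number runs first
    seq: int                           # submission order breaks priority ties
    name: str = field(compare=False)
    nodes_needed: int = field(compare=False)


class Dispatcher:
    """Toy single dispatcher for one queue (illustrative, not teuthology's)."""

    def __init__(self, free_nodes):
        self.free_nodes = free_nodes
        self._queue = []
        self._seq = count()

    def submit(self, name, priority, nodes_needed):
        job = Job(priority, next(self._seq), name, nodes_needed)
        heapq.heappush(self._queue, job)

    def dispatch_ready(self):
        # Start jobs in strict priority order. Stop at the first job that
        # does not fit, so later jobs never take nodes ahead of it.
        started = []
        while self._queue and self._queue[0].nodes_needed <= self.free_nodes:
            job = heapq.heappop(self._queue)
            self.free_nodes -= job.nodes_needed
            started.append(job.name)
        return started

    def release(self, nodes):
        # Called when a running job finishes and returns its nodes.
        self.free_nodes += nodes


d = Dispatcher(free_nodes=4)
d.submit("big", priority=10, nodes_needed=5)
d.submit("small", priority=50, nodes_needed=1)
print(d.dispatch_ready())  # [] : "big" is first in line and waits for 5 nodes
d.release(1)
print(d.dispatch_ready())  # ['big'] : "small" did not take nodes ahead of it
```

With competing workers, the one-node job could keep taking nodes as they free
up, and the five-node job might never see five free at once.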
For more details, check out the PR [1].
This is now running all the queues in the sepia lab - let us know if
you run into any bugs!
And thanks to Shraddha for her hard work on this!
Josh
[0] https://ceph.io/gsoc-2020/#teuthology-scheduling%20Improvements
[1] https://github.com/ceph/teuthology/pull/1546