As you may know, the Sepia Long Running Cluster has been hitting
capacity limits over the past week or so. This has resulted in service
disruptions to teuthology runs, chacra.ceph.com,
docker-mirror.front.sepia.ceph.com, and quay.ceph.io.
We've been able to get by so far by deleting and compressing logs more
aggressively, but that's neither ideal nor sustainable.
Patrick has created a new erasure-coded pool/filesystem that will allow
us to retain the same logs while using less raw space. In order to have
teuthology workers start writing logs to that pool, we need to take an
outage.
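For context, backing a CephFS directory with an erasure-coded pool
generally looks something like the sketch below. The pool, filesystem,
and path names here are illustrative assumptions, not the lab's actual
configuration:

```shell
# Sketch only: pool/fs/path names are assumptions, not the real setup.
# Create an EC pool and enable the partial overwrites CephFS needs:
ceph osd pool create cephfs_logs_ec erasure
ceph osd pool set cephfs_logs_ec allow_ec_overwrites true
# Attach it to the filesystem as an additional data pool:
ceph fs add_data_pool cephfs cephfs_logs_ec
# Direct new files under the log directory to the EC pool:
setfattr -n ceph.dir.layout.pool -v cephfs_logs_ec /teuthology-archive
```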
At 0400 UTC 19AUG2020, I will instruct all teuthology workers to die
after their running jobs finish. At 1300 UTC, I will kill any jobs that
are still running. This gives the lab 9 hours to gracefully shut down.
At that point, we will switch the mountpoint on teuthology.front over to
the new EC pool and start storing new logs there.
At the same time, Patrick will start migrating logs on the existing/old
pool to the new pool. This means that logs from 7/20 through 8/19 will
be unavailable (you'll see 404s) via the Pulpito web UI and qa-proxy
URLs until they're migrated to the new EC pool.
Let me know if you have any questions/concerns.
Thanks,
--
David Galloway
Systems Administrator, RDU
Ceph Engineering
IRC: dgalloway
Hi all,
Due to increases in the amount of testing and the size of logs, the
(Long Running) Ceph cluster in the Sepia lab has been at 95-98% capacity
over the past few days. Since almost everything else on the cluster was
already deleted a few months ago, I need to reduce the amount of test
logs we keep on hand.
Currently we:
- Keep 14 days of passed job logs
- Compress failed job logs older than 30 days
- Delete failed job logs older than 365 days
We will now be deleting failed job logs older than 300 days. We may be
able to increase the cluster's capacity with the purchase of additional
hardware, which I will discuss with the appropriate stakeholders.
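The retention policy above (with the new 300-day cutoff) can be sketched
as a small shell routine. The directory layout, with `pass`/`fail`
appearing in the log paths, is an assumption for illustration, not the
lab's actual structure:

```shell
#!/bin/sh
# Sketch of the retention policy; the 'pass'/'fail' path layout is an
# assumption, not the lab's actual directory structure.
prune_logs() {
    root=$1
    # Keep only 14 days of passed-job logs.
    find "$root" -type f -path '*pass*' -mtime +14 -delete
    # Delete failed-job logs older than 300 days (previously 365).
    find "$root" -type f -path '*fail*' -mtime +300 -delete
    # Compress remaining failed-job logs older than 30 days.
    find "$root" -type f -path '*fail*' -mtime +30 ! -name '*.gz' \
        -exec gzip {} +
}
```

Running compression after the 300-day purge avoids gzipping files that
are about to be deleted anyway.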