Hi Ulrich,
On Wed, 25 Mar 2020, Ulrich Weigand wrote:
Hello,
we're currently looking into setting up a Teuthology cluster to run the
Ceph integration test suite on IBM Z, to improve test coverage on our
platform.
However, we're not sure what hardware resources are required to do so. The
target configuration should be large enough to comfortably support running
an instance of the full Ceph integration tests. Is there some data
available from your experience with such installations on how large this
cluster needs to be then?
In particular, what number of nodes, #cpus and memory per node, number
(type/size) of disks that should be attached?
Thanks for any data / estimates you can provide!
I can provide some estimates and general guidance.
First, the test suites are currently targeted to run on 'smithi' nodes,
which are relatively low-powered x86 1U machines with a single NVMe
divided into 4 scratch LVs (+ an HDD for boot + logs). (This is somewhat
arbitrary--it's just the hardware we picked so the tests are written to
target that.)
Each test tends to take anywhere from 15m to 2h to run (with a few
outliers that take longer). Each test suite is somewhere between 100 and
400 tests. There are maybe 10 different suites we run with some
regularity, with a few (e.g., rados) taking up a larger portion of the
machine time. We currently have about 175 healthy smithi in service to
support developers.
I've been telling the aarch64 folks that we probably want at least 25-50
similarly-sized nodes in order to run the test suites in a reasonable
amount of time (e.g., a minimal rados suite in about a day, not days).
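
For a rough feel of how node count translates into turnaround time, here
is a back-of-the-envelope sketch in Python. It is only an illustration
under assumptions: the function name is made up, the 1h average test
duration is just a guess inside the 15m-2h range above, and the number of
nodes each test occupies (nodes_per_test) is not something stated here, so
plug in your own value.

```python
def suite_wall_clock_hours(num_tests, avg_test_hours, nodes, nodes_per_test=1):
    """Idealized wall-clock estimate for one suite run.

    Assumes perfect scheduling (no idle nodes, no retries, no setup time),
    so the real number will be higher. nodes_per_test is an assumed value,
    not a figure taken from the lab.
    """
    if nodes < nodes_per_test:
        raise ValueError("not enough nodes to run even a single test")
    # Total node-hours of work, spread evenly over the available nodes.
    node_hours = num_tests * avg_test_hours * nodes_per_test
    return node_hours / nodes


if __name__ == "__main__":
    # Hypothetical inputs: a 400-test suite, 1h per test, 2 nodes per test.
    for nodes in (25, 50):
        hours = suite_wall_clock_hours(400, 1.0, nodes, nodes_per_test=2)
        print(f"{nodes} nodes: about {hours:.0f}h")
```

With those made-up inputs, 25 nodes come out at roughly a day and a half
and 50 nodes at well under a day, which is in line with the 25-50 range
above.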
I'm not really sure how this maps onto the Z hardware, but hopefully this
provides some guidance!
sage