On 4/6/20 10:24 AM, Sage Weil wrote:
On Wed, 25 Mar 2020, Ulrich Weigand wrote:
we're currently investigating setting up a Teuthology cluster to run the
Ceph integration test suite on IBM Z, to improve test coverage on our
platform.
However, we're not sure what hardware resources are required to do so. The
target configuration should be large enough to comfortably support running
an instance of the full Ceph integration tests. Is there some data
available from your experience with such installations on how large such a
cluster needs to be?
In particular, what number of nodes, how many CPUs and how much memory per
node, and what number (and type/size) of disks should be attached?
Thanks for any data / estimates you can provide!
I can provide some estimates and a bit of context.
First, the test suites are currently targeted to run on 'smithi' nodes,
which are relatively low-powered x86 1u machines with a single NVMe
divided into 4 scratch LVs (+ an HDD for boot + logs). (This is somewhat
arbitrary--it's just the hardware we picked, so the tests are written to
run on it.)
Each test tends to take anywhere from 15m to 2h to run (with a few
outliers that take longer). Each test suite is somewhere between 100 and
400 tests. There are maybe 10 different suites we run with some
regularity, with a few (e.g., rados) taking up a larger portion of the
machine time. We currently have about 175 healthy smithi in service to
run them.
I've been telling the aarch64 folks that we probably want at least 25-50
similarly-sized nodes in order to run the test suites in a reasonable
amount of time (e.g., a minimal rados suite in ~a day, not days).
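The sizing arithmetic above can be sketched roughly as follows. This is a naive estimate, not how teuthology actually schedules jobs; the test durations and suite sizes are taken from the ranges in this thread, and the nodes-per-job figure is a hypothetical example (many jobs lock more than one node at a time).

```python
# Back-of-envelope wall-clock estimate for running a test suite on a
# fixed node pool, assuming jobs pack perfectly onto available nodes.

def suite_wall_clock_hours(num_tests, avg_test_hours, num_nodes,
                           nodes_per_job=1):
    """Total machine-hours divided evenly across the node pool.

    nodes_per_job accounts for jobs that lock several machines at once.
    """
    machine_hours = num_tests * avg_test_hours * nodes_per_job
    return machine_hours / num_nodes

# Hypothetical mid-sized rados run: 400 tests averaging 1h, 3 nodes/job.
print(suite_wall_clock_hours(400, 1.0, 50, nodes_per_job=3))   # 24.0 -> ~a day on 50 nodes
print(suite_wall_clock_hours(400, 1.0, 175, nodes_per_job=3))  # ~6.9h on the full smithi pool
```

This lines up with the suggestion above: 25-50 nodes keeps a large suite in the "about a day" range rather than multiple days.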
I'm not really sure how this maps onto the Z hardware, but hopefully this
provides some guidance!
FWIW, the hardware wasn't purchased arbitrarily. Before we purchased
the smithi nodes we did some analysis of the runtime and behavior of the
existing teuthology runs specifically to understand what hardware should
be purchased to complete tests as efficiently as possible. You can see
the report here:
At the time, the RADOS and Upgrade suites were by far the biggest
consumer of resources. The majority of time in both of those suites was
spent in the RADOS task (though data transfer and log compression did
make up a somewhat significant chunk of runtime). One of the reasons we
specifically went for a single fast NVMe drive in those nodes was that
simulated "thrashing" rados workloads completed much faster with a
single NVMe drive vs 4 independent HDDs. In both cases CPU usage did
not appear to be the dominating factor.
While the data is quite old now, the general trends are likely still
true. Utilizing (enterprise grade) flash should accelerate the tests
significantly. Beyond that, on X86 we expect to see up to 200-500MB/s
and 1000-3000 IOPS per core depending on specifics of the HW and version
of Ceph being tested. I don't know how Z cores would compare
exactly, but that at least provides a rough ballpark. Each OSD should
have at least 4GB of memory with some extra reserved for temporary
spikes or delayed kernel reclaim of released pages.
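As a rough illustration of the figures above, here is a small sizing sketch. The 4 GB per OSD and the 200-500 MB/s per core numbers come from this thread; the node shape (OSD count, core count) and the 25% headroom fraction are hypothetical assumptions for the example.

```python
# Back-of-envelope per-node sizing from the estimates in this thread.

def node_memory_gb(num_osds, gb_per_osd=4, headroom_frac=0.25):
    # Headroom covers temporary spikes and delayed kernel reclaim of
    # released pages; the 25% figure is an assumed example value.
    return num_osds * gb_per_osd * (1 + headroom_frac)

def node_throughput_mb_s(num_cores, mb_s_per_core=(200, 500)):
    # x86 estimate: 200-500 MB/s per core, depending on HW specifics
    # and the Ceph version under test.
    lo, hi = mb_s_per_core
    return num_cores * lo, num_cores * hi

print(node_memory_gb(4))        # 20.0 -> GB for a hypothetical 4-OSD test node
print(node_throughput_mb_s(8))  # (1600, 4000) -> MB/s range across 8 cores
```

How these numbers translate to s390x cores is exactly the open question in this thread, so treat the throughput range as an x86 baseline only.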