Hello,
I am observing non-intuitive results for a performance test using the S3 API to RGW. I
am wondering if others have similar experiences or knowledge here.
Our application sets the “If-None-Match” header on S3 API requests. The application sends
this header, carrying the ETag of its cached copy, when it already has a copy of the
object in question but wishes to check whether there is a newer version. If the ETag of
the current object matches, RGW sends a 304 response; otherwise it sends the updated
content of the object.
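To make the expected behaviour concrete, here is a minimal sketch of the decision described above. It models generic HTTP conditional-GET semantics only and is not RGW's actual code; the function names `opaque_tag` and `expected_status` are made up for illustration, and the comma splitting assumes entity tags that do not themselves contain commas.

```python
from typing import Optional


def opaque_tag(tag: str) -> str:
    """Strip whitespace, an optional weak prefix (W/) and surrounding quotes."""
    tag = tag.strip()
    if tag.startswith("W/"):
        tag = tag[2:]
    return tag.strip('"')


def expected_status(current_etag: str,
                    if_none_match: Optional[str],
                    has_range: bool = False) -> int:
    """Return the status code a server would be expected to send.

    Hypothetical model: If-None-Match may hold "*" or a comma-separated
    list of entity tags, compared without regard to the weak prefix.
    """
    if if_none_match is not None:
        candidates = {opaque_tag(t) for t in if_none_match.split(",")}
        if "*" in candidates or opaque_tag(current_etag) in candidates:
            return 304  # client copy is current, no body is sent
    # No match (or no header): the content is sent, partial if Range was given.
    return 206 if has_range else 200
```

With a Range header, as in our requests, a matching ETag would give 304 and a non-matching one 206.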
We’re observing that the response time of requests resulting in “304 Not Modified” is
typically slower than that of normal object retrieval. This wasn’t intuitive to me – in
the 304 case there is no content to transfer over the network, and I would expect that the
request could be satisfied just by looking at the RGW index (I was under the impression
that metadata, including the ETag, is in the index). Anecdotally, HEAD requests show
similar results, but I haven't yet analysed them in full.
Does anyone else have data on, or experience of, the expected performance in this scenario?
Are there any potential avenues for optimizing the configuration? What kind of commands can
I use to debug this further?
Some details of the current setup:
=> ceph version 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus (stable)
=> Objects are typically 80-100KB.
=> Versioning is enabled on the bucket.
=> Our requests specify a Range header (hence will generate 206 not 200).
=> Multisite features are enabled.
=> Bucket has 20 shards – I’ve put a dump of "bucket limits" below.
Performance results
Response          Request Count  Median  75th percentile  90th percentile  95th percentile
206 Partial       20473          3       3                16               129              1200
304 Not Modified  15644          9       16               46               212              1192
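For anyone wanting to produce a comparable summary from their own raw timings, a minimal sketch follows. It uses the nearest-rank percentile definition, which is an assumption and not necessarily what our test tool uses; the function names and any sample values passed in are purely illustrative.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values <= it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Rank is 1-based; multiply before dividing to avoid rounding surprises.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarise(samples):
    """Return the median and the 75th, 90th and 95th percentiles as a dict."""
    return {pct: percentile(samples, pct) for pct in (50, 75, 90, 95)}
```

Calling `summarise()` once per response code on the recorded latencies gives one row per code, in the same shape as the table above.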
Bucket details
{
"bucket": "albansstack-scsdata",
"tenant": "",
"num_objects": 465780,
"num_shards": 20,
"objects_per_shard": 23289,
"fill_status": "OK"
},
Many thanks,
Alistair.