I'd like to configure a cache tier to act as a write buffer, so that
writes promote objects but reads never promote an object. We
have a lot of cold data so we would like to tier down to an EC pool
(CephFS) after a period of about 30 days to save space. The storage tier
and the 'cache' tier would be on the same spindles, so the only performance
improvement would be from the faster writes with replication. So we really
don't want to move data between tiers.
The idea would be to not promote on read, since EC read performance is good
enough, and to have writes go to the cache tier, where the data may be
'hot' for a week or so before it gets cold.
It seems that we would only need one hit_set, and if -1 can't be set for
min_read_recency_for_promote, I could probably use 2, which would never hit
because there is only one set, but that may error too. The follow-up is how
big a set should be, since it only really tells if an object "may" be in
cache and does not determine when things are flushed. So it really only
matters how out of date we are okay with the bloom filter being, right?
So we could have it be a day long if we are okay with that stale rate? Is
there any advantage to having a longer period for a bloom filter? Now I'm
starting to wonder if I even need a bloom filter for this use case. Can I
get tiering to work without it and only use
cache_min_flush_age/cache_min_evict_age, since I don't care about promoting
when there are X hits in Y time?
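For concreteness, here is a rough Python sketch that prints the ceph
commands I have in mind. The pool names (cephfs-ec, cephfs-cache) and the
helper tier_commands are made up for illustration, and the values are only
the ones floated above. Whether min_read_recency_for_promote accepts 2 with
a single hit_set is exactly the open question, so treat that line as a
guess. The age and period settings are given in seconds.

```python
# Sketch only: pool names and values are hypothetical placeholders,
# not a tested configuration.
DAY = 24 * 60 * 60  # seconds in a day


def tier_commands(base_pool, cache_pool, flush_days=30,
                  hit_set_period_days=1, read_recency=2):
    """Build the ceph CLI commands for a writeback tier used as a write buffer."""
    settings = {
        "hit_set_type": "bloom",
        "hit_set_count": 1,                       # only one hit_set
        "hit_set_period": hit_set_period_days * DAY,
        # Open question: is a value above hit_set_count accepted?
        "min_read_recency_for_promote": read_recency,
        "cache_min_flush_age": flush_days * DAY,  # keep objects ~30 days
        "cache_min_evict_age": flush_days * DAY,
    }
    cmds = [
        f"ceph osd tier add {base_pool} {cache_pool}",
        f"ceph osd tier cache-mode {cache_pool} writeback",
        f"ceph osd tier set-overlay {base_pool} {cache_pool}",
    ]
    cmds += [f"ceph osd pool set {cache_pool} {key} {value}"
             for key, value in settings.items()]
    return cmds


if __name__ == "__main__":
    for cmd in tier_commands("cephfs-ec", "cephfs-cache"):
        print(cmd)
```

If the bloom filter turns out to be unnecessary, the three hit_set lines
and the recency line would simply be dropped from the dictionary.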
Thanks
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1