On Wed, Feb 3, 2021 at 2:17 PM Gregory Farnum <gfarnum@redhat.com> wrote:
> These interfaces seem a lot more complicated and require a lot of
> management outside the messenger: besides simply dealing with
> buffers, we need to handle negotiating compression techniques and
> keeping them uniform across servers running different versions. That
> sounds like a lot of pain to me.

I think the negotiation can be simple if it is based on OSDMap flags and we assume that a given version of Ceph always supports the same set of codecs.
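
As a rough sketch of what I mean (everything here is hypothetical: `Codec`, `MapFlags`, and `pick_codec` are illustrative names, not existing Ceph types or OSDMap fields), the sender would only need to consult the cluster-wide flags rather than negotiate per connection:

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical codec bits; not Ceph's real compressor identifiers.
enum class Codec : std::uint32_t {
  Snappy = 1u << 0,
  Zstd   = 1u << 1,
  Lz4    = 1u << 2,
};

constexpr std::uint32_t bit(Codec c) { return static_cast<std::uint32_t>(c); }

// Hypothetical stand-in for the relevant OSDMap flags: the set of codecs
// that every daemon in the cluster is assumed to support.
struct MapFlags {
  std::uint32_t allowed_codecs = 0;
};

// Pick the first codec, in local preference order, that the map allows.
// Returns std::nullopt when no preferred codec is allowed (send uncompressed).
std::optional<Codec> pick_codec(const MapFlags& flags,
                                const std::vector<Codec>& preference) {
  for (Codec c : preference) {
    if (flags.allowed_codecs & bit(c)) {
      return c;
    }
  }
  return std::nullopt;
}
```

The assumption doing the work here is that the flags are only raised once every daemon runs a release that supports the codec, so no per-peer handshake is needed.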

> If the main concern is re-compressing replicated OSD data, and given
> that we're using bufferlists and bufferptrs to share that data amongst
> the messages anyway, perhaps we should just do memoization on those
> data structures when we compress?

Yeah, I think this is the main alternative we should consider. My concern is that the buffer(list) code is already very complicated, and I worry about overgeneralizing it. Mostly we need bufferlists for encoding and moving things around in memory, and there are relatively few cases where we have large data buffers that may or may not be compressed (this is the only one, currently). Having bufferlist cache crcs, for instance, is something I still regret in retrospect.
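
To make the tradeoff concrete, memoizing on the buffer might look roughly like the sketch below. This is an assumption-laden illustration: `CachingBuffer` is a hypothetical stand-in built on `std::string`, not Ceph's bufferlist, and the compressor is passed in rather than being any real Ceph plugin.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a shared data buffer; not Ceph's bufferlist.
class CachingBuffer {
 public:
  using Compressor = std::function<std::string(const std::string&)>;

  explicit CachingBuffer(std::string raw) : raw_(std::move(raw)) {}

  // Return the compressed form, running the compressor only when no
  // cached result exists for the current contents.
  const std::string& compressed(const Compressor& compress) {
    if (!cached_) {
      cached_ = compress(raw_);
    }
    return *cached_;
  }

  // Every mutating operation has to remember to drop the cache. This is
  // the bookkeeping that spreads through a general-purpose buffer type.
  void append(const std::string& more) {
    raw_ += more;
    cached_.reset();
  }

  const std::string& raw() const { return raw_; }

 private:
  std::string raw_;
  std::optional<std::string> cached_;
};
```

The caching itself is short; the cost is that every mutator needs the invalidation, which is the same shape of problem as the cached crcs.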

Hmm, which makes me wonder: if we widen the Message set_data/get_data interface to include compression disposition, should it also include a checksum? I think it could then also cover the one case where we still rely on the bufferlist crc caching. (FileStore writes to the journal were the other, but we are probably past worrying about that now.)
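
Very roughly, and purely as a sketch: the types below are hypothetical (`SketchMessage`, `DataDisposition`, and `forward_to_replica` are made-up names, and `std::string` stands in for a bufferlist), but the widened interface could carry the compression state and an optional checksum alongside the payload:

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <utility>

// Hypothetical compression states for a message payload.
enum class Compression { None, Snappy, Zstd };

// Hypothetical metadata that travels with the payload.
struct DataDisposition {
  Compression compression = Compression::None;
  // Checksum over the bytes exactly as stored (i.e. post-compression),
  // if the producer already computed one.
  std::optional<std::uint32_t> checksum;
};

// Hypothetical stand-in for Message; only the data accessors are shown.
class SketchMessage {
 public:
  void set_data(std::string bytes, DataDisposition disp = {}) {
    data_ = std::move(bytes);
    disp_ = disp;
  }
  const std::string& get_data() const { return data_; }
  const DataDisposition& disposition() const { return disp_; }

 private:
  std::string data_;
  DataDisposition disp_;
};

// Build a replica message that reuses the payload as-is, so the primary
// neither recompresses nor recomputes the checksum.
SketchMessage forward_to_replica(const SketchMessage& m) {
  SketchMessage r;
  r.set_data(m.get_data(), m.disposition());
  return r;
}
```

With something like this, the checksum lives with the message rather than being cached inside the buffer.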

sage