For developers submitting jobs using teuthology, we now have
recommendations on what priority level to use:
https://docs.ceph.com/docs/master/dev/developer_guide/#testing-priority
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
Hello
We are planning to start QE validation of the release next week.
If you have PRs that are to be part of it, please let us know by
adding "needs-qa" for the 'quincy' milestone ASAP.
Thx
YuriW
Hi Folks,
I'm Kevin Zhao from Linaro, and I've cc'd Qinfei Liu from Huawei, the
author of this PR. We are working on building and supporting Ceph on
the openEuler OS.
openEuler (https://www.openeuler.org/en/) is now a very active and popular
operating system community. It is RPM-based, and all of its packages are
built from source. openEuler now attracts a lot of end users and
developers, especially in China; some companies already have trial
deployment clusters of Ceph on openEuler.
The PR Link: https://github.com/ceph/ceph/pull/50182
Given the variety of platforms in the Ceph upstream, it makes sense for
the community to support building on openEuler as well. We have proposed
the PR above and hope to get feedback. We would also like to know the
*process for Ceph to support a new operating system*.
One thing I want to mention is that we submitted the patch against the
Ceph 16.2.7 branch first, because in openEuler 22.03-LTS the Ceph version
is 16.2.7; we will also submit another PR to the main branch.
We hope to hear the community's feedback soon. Thanks in advance!
--
*Best Regards*
*Kevin Zhao*
Tech Lead, LDCG Cloud Infra & Storage
Linaro Vertical Technologies
IRC(freenode): kevinz
Slack(kubernetes.slack.com): kevinz
kevin.zhao(a)linaro.org | Mobile/Direct/Wechat: +86 18818270915
reviving this old thread about http clients after reading through
https://github.com/RobertLeahy/ASIO-cURL and discovering the
"multi-socket flavor" of the libcurl-multi API documented in
https://curl.se/libcurl/c/libcurl-multi.html
rgw's existing RGWHTTPManager uses the older flavor of libcurl-multi,
which requires a background thread that polls libcurl for io and
completions. this new flavor allows us to do all of the polling and
timers asynchronously with asio, and only call into libcurl for
non-blocking io when the sockets are ready to read/write. getting rid
of the background thread makes it much easier to integrate with asio
applications, because it removes many complications around locking and
object lifetimes
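to make that concrete, the integration boils down to two callbacks.
here's an untested sketch of the glue (poll_ctx and the completion
handling are my own inventions; CURLMOPT_SOCKETFUNCTION,
CURLMOPT_TIMERFUNCTION and curl_multi_socket_action() are the real
libcurl-multi entry points):

  #include <boost/asio.hpp>
  #include <curl/curl.h>
  #include <unistd.h>
  #include <chrono>
  #include <map>
  #include <memory>

  namespace asio = boost::asio;
  using stream_descriptor = asio::posix::stream_descriptor;

  struct poll_ctx {
    CURLM* multi = nullptr;
    asio::io_context& ioc;
    asio::steady_timer timer{ioc};
    // one descriptor per socket that libcurl asks us to watch
    std::map<curl_socket_t, std::unique_ptr<stream_descriptor>> sockets;
  };

  // drive non-blocking io on a ready socket, then reap finished transfers
  static void on_io(poll_ctx* p, curl_socket_t s, int ev)
  {
    int running = 0;
    curl_multi_socket_action(p->multi, s, ev, &running);
    int nmsgs = 0;
    while (CURLMsg* m = curl_multi_info_read(p->multi, &nmsgs)) {
      if (m->msg == CURLMSG_DONE) {
        // complete whichever request owns m->easy_handle
      }
    }
  }

  // CURLMOPT_SOCKETFUNCTION: libcurl tells us what each socket is waiting
  // for. a real implementation would re-arm these waits and track what's
  // already in flight
  static int socket_cb(CURL*, curl_socket_t s, int what, void* userp, void*)
  {
    auto p = static_cast<poll_ctx*>(userp);
    if (what == CURL_POLL_REMOVE) {
      p->sockets.erase(s); // destruction cancels any pending waits
      return 0;
    }
    auto& d = p->sockets[s];
    if (!d) { // dup() so the descriptor's close doesn't steal libcurl's fd
      d = std::make_unique<stream_descriptor>(p->ioc, ::dup(s));
    }
    if (what & CURL_POLL_IN) {
      d->async_wait(stream_descriptor::wait_read, [p, s] (auto ec) {
          if (!ec) on_io(p, s, CURL_CSELECT_IN);
        });
    }
    if (what & CURL_POLL_OUT) {
      d->async_wait(stream_descriptor::wait_write, [p, s] (auto ec) {
          if (!ec) on_io(p, s, CURL_CSELECT_OUT);
        });
    }
    return 0;
  }

  // CURLMOPT_TIMERFUNCTION: libcurl schedules its timeouts through asio
  static int timer_cb(CURLM*, long timeout_ms, void* userp)
  {
    auto p = static_cast<poll_ctx*>(userp);
    if (timeout_ms < 0) {
      p->timer.cancel();
    } else {
      p->timer.expires_after(std::chrono::milliseconds(timeout_ms));
      p->timer.async_wait([p] (auto ec) {
          if (!ec) on_io(p, CURL_SOCKET_TIMEOUT, 0);
        });
    }
    return 0;
  }

  void register_with_curl(poll_ctx* p)
  {
    curl_multi_setopt(p->multi, CURLMOPT_SOCKETFUNCTION, socket_cb);
    curl_multi_setopt(p->multi, CURLMOPT_SOCKETDATA, p);
    curl_multi_setopt(p->multi, CURLMOPT_TIMERFUNCTION, timer_cb);
    curl_multi_setopt(p->multi, CURLMOPT_TIMERDATA, p);
  }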
i experimented with this multi-socket API by building my own asio
integration in https://github.com/cbodley/ceph/pull/6. there are two
main reasons i find this especially interesting:
1) we've been doing some prototyping for multisite sync with asio's
c++20 coroutines. RGWHTTPManager only supports the
optional_yield-style coroutines, so we were talking about using beast
for this initial prototype. however, i listed several of beast's
missing features earlier in this thread (mainly timeouts and
connection pooling), so this new curl client could be a much better
fit here
2) curl can be built with HTTP/3 support, and that's what we've been
using to test rgw's prototype frontend in
https://github.com/ceph/ceph/pull/48178. we need a multiplexing client
like libcurl-multi in order to test QUIC's stream multiplexing. and
because the QUIC library depends on BoringSSL, this HTTP/3-enabled
version of curl can't be linked against rgw (which requires OpenSSL)
for RGWHTTPManager
On Thu, Oct 28, 2021 at 12:24 PM Casey Bodley <cbodley(a)redhat.com> wrote:
>
> On Thu, Oct 28, 2021 at 10:41 AM Yuval Lifshitz <ylifshit(a)redhat.com> wrote:
> >
> > Hi Casey,
> > When it comes to "technical debt", the main question is what is the ongoing cost of not making this change?
> > Do we see the memory allocation and copy into RGWHTTPArgs as a noticeable perf issue? Maybe there is a simpler way to resolve this specific issue?
>
> historically, we have seen very bad behavior from tcmalloc at high
> thread counts in rgw, and we've been making general efforts both to
> reduce allocations and the number of threads required. i don't think
> anyone has tried to measure the impact of RGWHTTPArgs itself, but i do
> see its use of map<string, string> as low hanging fruit. and because
> this piece is on rgw's http server side, replacing this map wouldn't
> require any of the client stuff described above
>
> > It looks like the list of things to do to achieve feature parity with libcurl is substantial.
>
> i agree! i wanted to start by documenting where the gaps are, to help
> us understand the scope of a project here
>
> even without dropping libcurl, i think there's a lot of potential
> cleanup in the several layers (rgw_http_client, rgw_rest_client,
> rgw_rest_conn, rgw_cr_rest) between libcurl and multisite. for
> multisite in general, i would really like to see it adopt similar
> async primitives to the rest of the rgw codebase so that we can share
> more code
>
> > Is there a desire by the beast maintainers to add these capabilities?
>
> beast has generally positioned itself as a low-level http protocol
> library, to serve as building blocks for higher-level client and
> server libraries/applications. the http ecosystem is vast, so it makes
> sense to limit the scope of any individual library. libcurl is
> enormous, yet still only covers the client side
>
> though with the addition of the tcp_stream in boost 1.70
> (https://www.boost.org/doc/libs/1_70_0/libs/beast/doc/html/beast/release_not…),
> beast did take a step toward this higher level of abstraction. it's
> definitely worth discussing whether additional features like client
> connection pooling would be in scope for the project. it's also worth
> researching what other asio-compatible http client libraries are out
> there
>
>
> > Yuval
> >
> >
> > On Tue, Oct 26, 2021 at 9:34 PM Casey Bodley <cbodley(a)redhat.com> wrote:
> >>
> >> dear Adam and list,
> >>
> >> aside from rgw's frontend, which is the server side of http, we also
> >> have plenty of http client code that sends http requests to other
> >> servers. the biggest user of the client is multisite sync, which uses
> >> http to read replication logs and fetch objects from other zones. all
> >> of this http client code is based on libcurl, and uses its 'multi api'
> >> to provide an async interface with a background thread that polls for
> >> completions
> >>
> >> it's hard to beat libcurl for stability and features, but there has
> >> also been interest in using asio+beast for the client ever since we
> >> added it to the frontend. benefits there would include a nicer c++
> >> interface, better integration with the asio async model (we do
> >> currently have wrappers for libcurl, but they're specific to
> >> coroutines), and the potential to use custom allocators to avoid most
> >> of the per-request allocations
> >>
> >>
> >> to help with a comparison against beast, these are the features of
> >> libcurl that we rely on:
> >>
> >> - asynchronous using the 'multi api' and a background thread
> >> (https://everything.curl.dev/libcurl/drive/multi)
> >> - connection pooling (see https://everything.curl.dev/libcurl/connectionreuse)
> >> - ssl context and optional certificate verification
> >> - connect/request timeouts
> >> - rate limits
> >>
> >> see RGWHTTPClient::init_request() in rgw_http_client.cc for all of the
> >> specific CURLOPT_ features we're using now
> >>
> >> also noteworthy is curl's support for http/1.1, http/2, and http/3
> >> (https://everything.curl.dev/libcurl-http/versions)
> >>
> >>
> >> asio does not have connection pooling or connect timeouts (though it
> >> has the components necessary to build them), and beast only supports
> >> http/1.1. i think everything else in the list is covered:
> >>
> >> ssl support comes from boost::asio::ssl and ssl_stream
> >>
> >> there's a tcp_stream class
> >> (https://www.boost.org/doc/libs/1_70_0/libs/beast/doc/html/beast/ref/boost__…)
> >> that wraps a tcp socket and adds rate limiting and timeouts. we use
> >> that in the frontend, though we're tracking a performance regression
> >> related to its timeouts in https://tracker.ceph.com/issues/52333
> >>
> >> there's a very nice http::fields class
> >> (https://www.boost.org/doc/libs/1_70_0/libs/beast/doc/html/beast/ref/boost__…)
> >> for headers that has custom allocator support. there's an
> >> 'http_server_fast' example at
> >> https://www.boost.org/doc/libs/1_70_0/libs/beast/example/http/server/fast/h…
> >> that uses the custom allocator in
> >> https://www.boost.org/doc/libs/1_70_0/libs/beast/example/http/server/fast/f….
> >> i'd love to see something like that replace our use of map<string,
> >> string> for headers in RGWHTTPArgs during request processing
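> >>
> >> as a sketch of what that could look like (untested; the allocator alias
> >> below is a placeholder for something like that fields_alloc):
> >>
> >>   #include <boost/beast/http.hpp>
> >>   #include <memory>
> >>
> >>   namespace http = boost::beast::http;
> >>
> >>   // http::fields is just basic_fields<std::allocator<char>>, so a
> >>   // pooled header map is only a different allocator argument away
> >>   template <class T>
> >>   using header_allocator = std::allocator<T>; // placeholder
> >>
> >>   using pooled_fields = http::basic_fields<header_allocator<char>>;
> >>   using pooled_request = http::request<http::string_body, pooled_fields>;
> >>   using pooled_response = http::response<http::string_body, pooled_fields>;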
> >>
> >>
> >> for connection pooling with asio, i did explore this for a while with
> >> Abhishek in https://github.com/cbodley/nexus/tree/wip-connection-pool/include/nexus/htt….
> >> it had connect timeouts and some test coverage in
> >> https://github.com/cbodley/nexus/blob/wip-connection-pool/test/http/test_co…,
> >> but needs more work. for example, each connection_pool is constructed
> >> with one hostname/port. there also needs to be a map of these pools,
> >> keyed either on hostname/port or resolved address, so we can cache
> >> connections for any url the client requests
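> >>
> >> a rough sketch of that map of pools (untested; pool_key and pool_map are
> >> invented, and the pool type is left as a template parameter standing in
> >> for connection_pool):
> >>
> >>   #include <boost/asio.hpp>
> >>   #include <map>
> >>   #include <string>
> >>   #include <tuple>
> >>
> >>   struct pool_key {
> >>     std::string host;
> >>     std::string port;
> >>     bool tls = false;
> >>     bool operator<(const pool_key& r) const {
> >>       return std::tie(host, port, tls) < std::tie(r.host, r.port, r.tls);
> >>     }
> >>   };
> >>
> >>   template <class ConnectionPool>
> >>   class pool_map {
> >>     boost::asio::io_context& ioc;
> >>     std::map<pool_key, ConnectionPool> pools;
> >>    public:
> >>     explicit pool_map(boost::asio::io_context& ioc) : ioc(ioc) {}
> >>     // find or create the pool for a given endpoint
> >>     ConnectionPool& get(const pool_key& key) {
> >>       auto [it, inserted] = pools.try_emplace(key, ioc, key.host, key.port);
> >>       return it->second;
> >>     }
> >>   };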
> >>
> >> i was also imagining higher-level interfaces like http::async_get()
> >> (and head/put/post/etc) that would hide the use of connection pooling
> >> entirely, and use beast's request/response concepts to write the
> >> request and read its response. this is also a good place to implement
> >> retries. i explored this idea in a separate repo here
> >> https://github.com/cbodley/requests/tree/master/include/requests
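> >>
> >> the shape i have in mind is something like this (signature invented and
> >> untested; the completion token selects between callbacks, futures and
> >> coroutines as usual in asio):
> >>
> >>   #include <boost/asio.hpp>
> >>   #include <boost/beast/http.hpp>
> >>   #include <string_view>
> >>
> >>   namespace http = boost::beast::http;
> >>
> >>   // resolves the url, leases a pooled connection, writes the request,
> >>   // reads its response with retries, then returns the connection to
> >>   // the pool. completion signature:
> >>   //   void(boost::system::error_code, http::response<ResponseBody>)
> >>   template <class ResponseBody = http::string_body, class CompletionToken>
> >>   auto async_get(boost::asio::io_context& ioc, std::string_view url,
> >>                  http::fields headers, CompletionToken&& token);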
> >>
> >> with asio, we can attach a connection pooling service as an
> >> io_context::service that gets created automatically on first use, and
> >> saved over the lifetime of the io_context. the application would have
> >> the option to configure it, but doesn't have to know anything about it
> >> otherwise
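> >>
> >> mechanically, asio's service machinery handles most of that. an untested
> >> sketch (pool_service is invented; use_service<> is the real asio call
> >> that creates it on first use):
> >>
> >>   #include <boost/asio.hpp>
> >>
> >>   class pool_service : public boost::asio::execution_context::service {
> >>    public:
> >>     using key_type = pool_service;
> >>     static inline boost::asio::execution_context::id id;
> >>
> >>     explicit pool_service(boost::asio::execution_context& ctx)
> >>         : service(ctx) {}
> >>     // the map of connection pools would live here
> >>    private:
> >>     void shutdown() override {} // close any cached connections at exit
> >>   };
> >>
> >>   // first use creates the service; it lives as long as the io_context:
> >>   //   auto& pools = boost::asio::use_service<pool_service>(ioc);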
> >>
> >> overloading those high-level interfaces could also provide a good
> >> abstraction to support http 2 and 3, where their connection pools
> >> would just have one connection per address, and each request would
> >> open its own stream
> >>
> >> _______________________________________________
> >> Dev mailing list -- dev(a)ceph.io
> >> To unsubscribe send an email to dev-leave(a)ceph.io
> >>
Hi Ceph developers,
The March CDM is coming up this Wednesday, March 1st @ 2:00 UTC. See more
meeting details below.
Please add any topics you'd like to discuss to the agenda:
https://tracker.ceph.com/projects/ceph/wiki/CDM_01-MAR-2023
Meeting link:
https://bluejeans.com/908675367
Time conversions:
UTC: Wednesday, March 1, 2:00 UTC
Mountain View, CA, US: Tuesday, February 28, 18:00 PST
Phoenix, AZ, US: Tuesday, February 28, 19:00 MST
Denver, CO, US: Tuesday, February 28, 19:00 MST
Huntsville, AL, US: Tuesday, February 28, 20:00 CST
Raleigh, NC, US: Tuesday, February 28, 21:00 EST
London, England: Wednesday, March 1, 2:00 GMT
Paris, France: Wednesday, March 1, 3:00 CET
Helsinki, Finland: Wednesday, March 1, 4:00 EET
Tel Aviv, Israel: Wednesday, March 1, 4:00 IST
Pune, India: Wednesday, March 1, 7:30 IST
Brisbane, Australia: Wednesday, March 1, 12:00 AEST
Singapore, Asia: Wednesday, March 1, 10:00 +08
Auckland, New Zealand: Wednesday, March 1, 15:00 NZDT
--
Laura Flores
She/Her/Hers
Software Engineer, Ceph Storage <https://ceph.io>
Chicago, IL
lflores(a)ibm.com | lflores(a)redhat.com <lflores(a)redhat.com>
M: +17087388804
On Sun, Feb 26, 2023 at 2:15 PM Patrick Schlangen <patrick(a)schlangen.me> wrote:
>
> Hi Ilya,
>
> > Am 26.02.2023 um 14:05 schrieb Ilya Dryomov <idryomov(a)gmail.com>:
> >
> > Isn't OpenSSL 1.0 long out of support? I'm not sure if extending
> > librados API to support a workaround for something that went EOL over
> > three years ago is worth it.
>
> fair point. However, as long as ceph still supports compiling against OpenSSL 1.0 and has special code paths to initialize OpenSSL for versions <= 1.0, I think this should be fixed. The other option would be to remove OpenSSL 1.0 support completely.
>
> What do you think?
Removing OpenSSL 1.0 support is fine with me but it would need a wider
discussion. I'm CCing the development list.
Thanks,
Ilya
Hi Cephers,
These are the minutes of today's meeting (quicker than usual since some CLT
members were at Ceph Days NYC):
- *[Yuri] Upcoming Releases:*
- Pending PRs for Quincy
- Sepia Lab still absorbing the PR queue after the past issues
- [Ernesto] Github started sending dependabot alerts to developers
(previously they were only sent to org admins)
-
https://github.blog/2023-01-17-dependabot-alerts-are-now-visible-to-more-de…
- Most don't necessarily involve a risk (e.g.: Javascript dependency
only exploitable in a back-end/node.js server)...
- ... but it might still cause some unnecessary concern among devs/users
regarding Ceph security status
- Current list of vulnerable dependencies:
https://github.com/ceph/ceph/security/dependabot
- 40% are Dashboard Javascript ones (most could be dismissed since they
only have impact when used in node.js apps)
- Remaining ones are:
- Python: requirements.txt (not relevant since Python package versions
change with every distro and we assume distro-maintainers will fix those)
- It might become more relevant when we start packaging Python deps (
https://github.com/ceph/ceph/pull/47501/)
- Golang: "/examples/rgw" path (Casey opened
https://tracker.ceph.com/issues/58828, but maybe we should just dismiss
the alert?)
- [Ernesto] Enabling Github Auto-merge feature in the Ceph repo
-
https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/i…
- Use case:
- There's a PR with approvals but flaky CI tests (API, make check, ...)
(example: https://github.com/ceph/ceph/pull/50201)
- We could retrigger tests and come back to the PR page multiple times
until all tests pass...
- ... Or we just click the "Auto-merge" button, fill out the merge
message as usual, and let Github merge it when the CI tests pass.
- It'd reduce cognitive load, especially with small PRs (docs, backport
PRs) where the overhead of the PR process is more noticeable.
- There's still one issue:
- Keeping Redmine in sync with Github
- It could be done when clicking Auto-merge, or by still requiring
reviewers to poll the PR until it passes and then update Redmine (not ideal)
- A Github action that updates a tracker when Github merges the PR would
be very useful
- Yuri/Ilya: discussion around reversing the order of the backport
requirements (needs-qa label vs. approvals vs. CI tests passing).
- Greg pointed out the risks of auto-merge merging PRs with patches
submitted after passing requirements or approvals. Auto-merge status should
be reset on new commits.
- Decision: not to enable it.
- Yuri suggested auto-labeling PRs with passing CI, so they better know
when to start QA testing.
- Separate discussion on CI flakiness & stability and the lack of clear
points of contact (Kefu and David used to handle that). For unit tests it's
clear that the affected teams should handle them, but for infrastructure
issues there's still a vacuum.
Kind Regards,
Ernesto
there are some upcoming changes to the rgw qa suite [1] and its
accompanying s3-tests [2] and ragweed repos [3] that, once merged,
will cause earlier ceph-ci builds to fail the rgw suite
this just means that ceph-ci branches and suite-branches will need to
rebase on main after these merges, so plan accordingly. we don't
usually announce these changes, but with the reef freeze on the
horizon i don't want to delay anyone's testing
[1] https://github.com/ceph/ceph/pull/49950
[2] https://github.com/ceph/s3-tests/pull/487
[3] https://github.com/ceph/ragweed/pull/26
On Mon, Feb 13, 2023 at 8:48 PM Feng, Hualong <hualong.feng(a)intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Casey Bodley <cbodley(a)redhat.com>
> > Sent: Wednesday, October 12, 2022 11:11 PM
> > To: Feng, Hualong <hualong.feng(a)intel.com>
> > Cc: Mark Kogan <mkogan(a)redhat.com>; Tang, Guifeng
> > <guifeng.tang(a)intel.com>; Ma, Jianpeng <jianpeng.ma(a)intel.com>;
> > dev(a)ceph.io
> > Subject: Re: RGW encrypt is implemented by qat batch and queue mode
> >
> > On Thu, Sep 22, 2022 at 9:31 PM Feng, Hualong <hualong.feng(a)intel.com>
> > wrote:
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Casey Bodley <cbodley(a)redhat.com>
> > > > Sent: Wednesday, September 21, 2022 10:20 PM
> > > > To: Feng, Hualong <hualong.feng(a)intel.com>
> > > > Cc: Mark Kogan <mkogan(a)redhat.com>; Tang, Guifeng
> > > > <guifeng.tang(a)intel.com>; Ma, Jianpeng <jianpeng.ma(a)intel.com>;
> > > > dev(a)ceph.io
> > > > Subject: Re: RGW encrypt is implemented by qat batch and queue mode
> > > >
> > > > On Mon, Sep 19, 2022 at 4:06 AM Feng, Hualong
> > > > <hualong.feng(a)intel.com>
> > > > wrote:
> > > > >
> > > > > Hi Mark, Casey
> > > > >
> > > > > Could you spare some time to help review these two PRs, or add them
> > > > > to your plan?
> > > > >
> > > > > The PR link is below:
> > > > >
> > > > > https://github.com/ceph/ceph/pull/47040
> > > > >
> > > > > https://github.com/ceph/ceph/pull/47845
> > > > >
> > > > > I reimplemented the qat encryption plugin. Since the existing RGW
> > > > > encryption uses 4KB as an encryption unit, the performance is poor
> > > > > when the qat batch interface is not used. The reimplementation is done
> > > > > in two PRs. PR47040 handles encrypted data blocks larger than 128KB:
> > > > > each time, 32 pieces of 4K data are taken out for a batch submission.
> > > > > PR47845 builds on PR47040: each encrypted data block smaller than
> > > > > 128KB is first put into a buffer queue, and when 32 pieces of 4K data
> > > > > have accumulated or a timeout is reached, a batch submission is
> > > > > performed.
> > > > >
> > > > > The performance results are below; the higher the CPU usage, the
> > > > > more obvious the effect of qat.
> > > > >
> > > > > [performance charts were attached as images]
> > > > >
> > > > > From the flame graph, the proportion of the qat-based encryption
> > > > > plug-in in the RGWPutObj::execute function is lower than that of the
> > > > > isal-based encryption plug-in.
> > > > >
> > > > > Thanks
> > > > >
> > > > > -Hualong
> > > >
> > > > hey Hualong et al, (cc dev list)
> > > >
> > > > thanks for reaching out, this really helps me understand what those
> > > > PRs are trying to accomplish
> > > >
> > > > in general i'm concerned about the need for threads, locking, and
> > > > buffering down in the crypto plugins. ideally this stuff would be
> > > > under the application's control. in radosgw, we've been trying to
> > > > eliminate any blocking waits on condition variables in our i/o path
> > > > now that requests are handled in coroutines - instead of blocking an
> > > > entire thread, we just suspend the coroutine and run others in the
> > > > meantime
> > >
> > > I agree with your view, but the crypto function calls are still using the
> > > synchronous interface. If we don't want the plugin to contain condition
> > > variables, we need to implement the plugin in an asynchronous way and
> > > provide an asynchronous interface. This requires changes to the
> > > interface that RGW calls.
> > >
> > > And the number of QAT instances is difficult to keep consistent with the
> > > number of threads. The number of QAT instances (hardware resources) is
> > > limited. When the number of instances is less than the number of threads,
> > > we still need to wait for a free instance. If an asynchronous interface is
> > > used, we can use the queue as a buffer to avoid blocking the current
> > > thread while waiting for a free instance.
> > >
> > > If it is still a synchronous interface, there is no good way to eliminate
> > > the condition variable. Do you have a better suggestion here?
> >
> > below you suggest that we could fall back to CPU processing for small object
> > uploads. could we use that same fallback in the cases where we'd otherwise
> > have to block waiting for a QAT instance?
> >
> > >
> > > > seeing that graph by object size, my first impression was that
> > > > radosgw should be using bigger buffers.
> > >
> > > Use a bigger buffer? You mean we should change the encrypted CHUNK_SIZE
> > > from the current 4096B to a bigger one?
> > > Or other buffers?
> >
> > sorry not the CHUNK_SIZE, but the total amount of data we can feed into QAT
> > at a time. i see in https://github.com/ceph/ceph/pull/47040
> > that you've found the loop in AES_256_CBC::cbc_transform() which breaks
> > the input into CHUNK_SIZEd pieces, and you converted that loop into a single
> > batch() call - that part looks great
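> >
> > the shape of that change, roughly (my own untested sketch with invented
> > names, not the actual rgw_crypt code):
> >
> >   #include <algorithm>
> >   #include <cstddef>
> >   #include <cstdint>
> >
> >   constexpr std::size_t CHUNK_SIZE = 4096;
> >
> >   // stand-ins for the per-chunk cpu transform and the batched qat one
> >   void cpu_transform(uint8_t* out, const uint8_t* in,
> >                      std::size_t len, std::size_t off);
> >   void qat_batch_transform(uint8_t* out, const uint8_t* in, std::size_t len);
> >
> >   void transform(uint8_t* out, const uint8_t* in, std::size_t size)
> >   {
> >     // before: one synchronous call per 4k chunk
> >     for (std::size_t off = 0; off < size; off += CHUNK_SIZE) {
> >       cpu_transform(out + off, in + off,
> >                     std::min<std::size_t>(CHUNK_SIZE, size - off), off);
> >     }
> >     // after: a single batched submission covering the same range
> >     // qat_batch_transform(out, in, size);
> >   }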
> >
> > if each call to cbc_transform() is getting large enough buffers, it could acquire
> > exclusive access to one QAT instance, feed all of its data through it, then
> > release the instance back to a pool. for a large object workload it seems like
> > this strategy would best utilize the hardware resources, because you never
> > have to coordinate a single batch across several threads. you just need to
> > acquire/release access to a QAT instance every 4MB, which you can use for
> > 32 batches (assuming batch size is 32*4k=128k?) at a time
> >
> > >
> > > > GetObj and PutObj are both reading data in 4MB chunks, maybe we can
> > > > find a way to use the qat batch interfaces within those chunks?
> > >
> > > Yes, they are both reading data in 4MB chunks.
> > > But when the object we put is larger than 4MB, the data block size when
> > > calling the encryption function is not necessarily 4MB.
> > >
> > > For example, the request below puts an object, but the data block size
> > > that the encryption function sees is 64KB:
> > >
> > > PUT /examplebucket/chunkObject.txt
> > > content-encoding:aws-chunked
> > > content-length:66824
> > > host:s3.amazonaws.com
> > > x-amz-content-sha256:STREAMING-AWS4-HMAC-SHA256-PAYLOAD
> > > x-amz-date:20130524T000000Z
> > > x-amz-decoded-content-length:66560
> > > x-amz-storage-class:REDUCED_REDUNDANCY
> > > Authorization:AWS4-HMAC-SHA256 Credential=AKIAIOSFODNN7EXAMPLE/20130524/us-east-1/s3/aws4_request,SignedHeaders=content-encoding;content-length;host;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length;x-amz-storage-class,Signature=4f232c4386841ef735655705268965c44a0e4690baa4adea153f7db9fa80a0a9
> > > ---------------
> > > 10000;chunk-signature=ad80c730a21e5b8d04586a2213dd63b9a0e99e0e2307b0ade35a65485a288648
> > > <65536-bytes>
> > > ---------------
> > > 10000;chunk-signature=ad80c730a21e5b8d04586a2213dd63b9a0e99e0e2307b0ade35a65485a288648
> > > <65536-bytes>
> >
> > are you sure that this http-level chunking has an effect on the buffer sizes that
> > encryption sees? it may cause the buffers to be segmented at 64k, but
> > encryption and decryption both call bufferlist::c_str() to reallocate a single
> > contiguous buffer:
> > https://github.com/ceph/ceph/blob/9aa8bed/src/rgw/rgw_crypt.cc#L490
> >
> > so i'd still expect this loop in RGWPutObj::execute() to read up to
> > rgw_max_chunk_size at a time:
> > https://github.com/ceph/ceph/blob/fc01eeb7/src/rgw/rgw_op.cc#L4111-L4141
> >
> > if there are cases where the RGWPutObj_BlockEncrypt filter isn't getting large
> > enough buffers, we can use the same strategy as
> > https://github.com/ceph/ceph/pull/21479, where we improved compression
> > ratios by adding a buffering filter in front
> >
> > >
> > > > that could avoid the need for cross-thread queues and
> > > > synchronization. compared to your approach in
> > > > https://github.com/ceph/ceph/pull/47845, i imagine this would show
> > > > less of a benefit for small object uploads, but more of a benefit
> > > > for the big ones. do you think this could work?
> > >
> > > To avoid the need for cross-thread queues, and accepting less of a benefit
> > > for small object uploads, we can send small objects to CPU processing and
> > > use the QAT batch api only for big objects.
> > >
> > > Hi Casey
> > >
> > > Thanks for your reply. The detailed messages and some questions are above.
> > >
> > > Thanks
> > > -Hualong
> > >
> >
> > all of my feedback here relates to large objects, though you've really focused
> > on small objects in https://github.com/ceph/ceph/pull/47845.
> > for small object workloads, i do agree that the queuing and thread
> > synchronization is necessary to take advantage of this batching
> >
> > it's just hard for me to tell whether that extra complexity is worth it. we've
> > tried to minimize any synchronization between rgw's requests so that we're
> > able to scale to thousands of concurrent requests/connections. at scale, i'd
> > worry that lock contention here would negate some of the gains from QAT
> >
> > in workloads with a mix of small and large objects, i think we'd make the best
> > use of QAT if we applied it to the larger objects (>= 128k?) where we can use
> > it most efficiently
>
> Hi Casey
>
> About the PR https://github.com/ceph/ceph/pull/47040: I have changed the code to use coroutines,
> so it is able to scale without lock contention. And I restrict the use of qat to objects >64K so that
> we can use it most efficiently.
>
> In the QAT part of the code, I use two async_* interfaces: one to get an instance, and another to submit a perform request.
> And in the rgw code, in order to get the `yield` parameter into the crypto plugin, I added an extra parameter to all the process functions.
>
> Can you help review it? Is the coroutine approach I used feasible?
thanks Hualong,
the coroutine changes are nicely done; however i still have concerns
about the overall design:
1. these crypto plugins are meant to be common to all ceph components.
rgw may be the only user now, but this reliance on a coroutine-based
runtime could make the plugin unusable elsewhere
the `optional_yield` wrapper (which may or may not contain a real
coroutine yield_context) can potentially make this more general,
if-and-only-if the plugin has a synchronous implementation as a
fallback. currently, the plugin interfaces take an optional_yield, but
QccCrypto::perform_op_batch() calls y.get_yield_context() on it
unconditionally. even within rgw, there may not be a real
yield_context - rgw_beast_enable_async may be false, or the object
write may be driven synchronously by an admin command like
`radosgw-admin obj rewrite`
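the kind of fallback i have in mind looks something like this (untested
sketch; optional_yield is the real type from common/async/yield_context.h,
but async_qat_op() and sync_cpu_op() are invented stand-ins for the
plugin's entry points):

  #include <boost/asio/spawn.hpp>
  #include "common/async/yield_context.h"

  bool async_qat_op(boost::asio::yield_context yield); // invented
  bool sync_cpu_op(); // invented

  bool perform_op(optional_yield y)
  {
    if (y) {
      // we're in a beast coroutine: suspend it while the hardware works
      return async_qat_op(y.get_yield_context());
    }
    // no coroutine available (rgw_beast_enable_async=false, radosgw-admin,
    // or a non-rgw caller): fall back to a synchronous implementation
    return sync_cpu_op();
  }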
2. even with coroutines, rgw requests may still have to wait for a qat
instance. with a limited number of instances, couldn't this itself
become a bottleneck as we scale up the number of concurrent requests?
earlier in the thread, we discussed falling back to the cpu
implementation if there wasn't a qat instance available. that could
avoid the need for waits, synchronous or otherwise, inside of the
plugin. this would let us take advantage of hardware acceleration when
we can without introducing any new contention. do you see any
drawbacks to this approach?
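in pseudocode, the acquire path would look something like this (untested;
qat_instance_pool and try_acquire() are invented names):

  #include <optional>

  struct qat_instance; // handle to one hardware instance

  struct qat_instance_pool {
    // returns nullopt instead of blocking when every instance is busy
    std::optional<qat_instance*> try_acquire();
    void release(qat_instance*);
  };

  void transform(qat_instance_pool& pool)
  {
    if (auto inst = pool.try_acquire()) {
      // hardware path: feed the batch through this instance
      pool.release(*inst);
    } else {
      // all instances busy: run the cpu implementation instead of
      // making the request wait
    }
  }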
i hate to keep sending you back to the drawing board; would it help to
discuss this in person? the Performance Weekly call
(https://pad.ceph.com/p/performance_weekly) could be a good place for
that. if that isn't a good time, we might schedule a separate call or
wait until March 1st for the Ceph Developer Monthly (APAC)
>
> And about the PR https://github.com/ceph/ceph/pull/47845, I do agree with your view that the extra complexity isn't
> worth it. I will close this PR.
>
> Thanks
> -Hualong