Hello everyone,
I'm implementing Jaeger tracing in RGW; some images are below.
[image: RGW_DELETE_OBJ.png] For deleting an object
[image: RGW_LIST_BUCKETS.png] For getting the list of buckets in the cluster
I have two tags in the Jaeger UI to filter spans: one is the gateway (swift or
s3) and the other is the RGW operation type; for example, putting an object
has the name "RGW_OP_PUT_OBJ".
Is this much detail sufficient, or should I go deeper into each function and
try to trace those as well?
Also, what improvements or changes could I make to make this more
developer-friendly?
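As a sketch of how the two tags support filtering (the Span shape and tag names here are illustrative stand-ins, not the actual RGW or Jaeger client code):

```python
from dataclasses import dataclass, field

# Minimal model of tagged spans; "gateway" and "rgw_op" mirror the two
# tags described above, but the names are illustrative only.

@dataclass
class Span:
    name: str
    tags: dict = field(default_factory=dict)

def filter_spans(spans, **wanted):
    """Return the spans whose tags match every key/value in `wanted`,
    which is essentially what the Jaeger UI tag filter does."""
    return [s for s in spans
            if all(s.tags.get(k) == v for k, v in wanted.items())]

spans = [
    Span("RGW_OP_PUT_OBJ", {"gateway": "s3", "rgw_op": "RGW_OP_PUT_OBJ"}),
    Span("RGW_OP_DELETE_OBJ", {"gateway": "swift", "rgw_op": "RGW_OP_DELETE_OBJ"}),
]

s3_spans = filter_spans(spans, gateway="s3")
```

With both tags present, a developer can narrow to a single gateway, a single operation type, or the intersection of the two.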
Thank you
Hi All,
Sorry for the noise, I'm still not receiving mail from this list (or
from ceph-users for that matter) and needed to do another test.
David (or whoever else is dev-owner), if it's not too much trouble can
you please send me a snippet from the mailman/postfix log for when
*this* message is delivered back to me?
Thanks a lot in advance, and again, sorry for the noise.
Regards,
Tim
--
Tim Serong
Senior Clustering Engineer
SUSE
tserong(a)suse.com
Running Nautilus 14.2.9 and trying to follow the STS example given here:
https://docs.ceph.com/docs/master/radosgw/STS/ to set up a policy
for AssumeRoleWithWebIdentity using KeyCloak (8.0.1) as the OIDC provider.
I am able to see in the rgw debug logs that the token being passed from the
client is passing the introspection check, but it always ends up failing
the final authorization to access the requested bucket resource and is
rejected with a 403 status "AccessDenied".
I configured my policy as described in the 2nd example on the STS page
above. I suspect the problem is with the "StringEquals" condition statement
in the AssumeRolePolicy document (I could be wrong though).
The example shows using the keycloak URI followed by ":app_id" matching
with the name of the keycloak client application ("customer-portal" in the
example). My keycloak setup does not have any such field in the
introspection result and I can't seem to figure out how to make this all
work.
I cranked up the logging to 20/20 and still did not see any hints as to
what part of the policy is causing the access to be denied.
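For reference, the trust policy I'm describing has roughly this shape (the realm URL and client name below are placeholders, not my actual setup):

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": ["arn:aws:iam:::oidc-provider/keycloak.example.com/auth/realms/demo"]
    },
    "Action": ["sts:AssumeRoleWithWebIdentity"],
    "Condition": {
      "StringEquals": {
        "keycloak.example.com/auth/realms/demo:app_id": "customer-portal"
      }
    }
  }]
}
```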
Any suggestions?
-Wyllys Ingersoll
(Please use reply-all so that people I've explicitly tagged and I can continue to get direct email replies)
I have a conundrum around developing best practices for Rook/Ceph clusters around OSD memory targets and hardware recommendations for OSDs. I want to lay down some bulleted notes first.
- 'osd_memory_target' defaults to 4GB
- OSDs attempt to keep memory allocation to 'osd_memory_target'
- BUT... this is only best-effort
- AND... there is no guarantee that the kernel will actually reclaim memory that OSDs release/unmap
- Therefore, we (SUSE) have developed a recommendation that ...
Total OSD node RAM required = (num OSDs) x (1 GB + osd_memory_target) + 16 GB
- In Rook/Kubernetes, Ceph OSDs will read the POD_MEMORY_REQUEST and POD_MEMORY_LIMIT env vars to infer a new default value for 'osd_memory_target'
- POD_MEMORY_REQUEST translates directly to 'osd_memory_target' 1:1
- POD_MEMORY_LIMIT (if REQUEST is unset) will set 'osd_memory_target' using the formula ( LIMIT x osd_memory_target_cgroup_limit_ratio )
- the default 'osd_memory_target' will be = min(REQUEST, LIMIT*ratio)
- Lars has suggested that setting limits is not a best practice for Ceph; when limits are encountered, Ceph is likely in a failure state, and killing daemons could result in a "thundering herd" distributed-systems problem
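To make the inference rules above concrete, here is a small sketch. The 0.8 default for osd_memory_target_cgroup_limit_ratio is my understanding of the Ceph default; treat the function names and values as illustrative, not real Ceph code:

```python
GiB = 1024 ** 3

def inferred_osd_memory_target(request=None, limit=None,
                               cgroup_limit_ratio=0.8,
                               default=4 * GiB):
    """Mirror the rules above: REQUEST translates 1:1, LIMIT contributes
    LIMIT * ratio, and if both are set the minimum of the two wins."""
    candidates = []
    if request is not None:
        candidates.append(request)
    if limit is not None:
        candidates.append(int(limit * cgroup_limit_ratio))
    return min(candidates) if candidates else default

def node_ram_required(num_osds, osd_memory_target):
    """SUSE recommendation: (num OSDs) x (1 GiB + target) + 16 GiB."""
    return num_osds * (1 * GiB + osd_memory_target) + 16 * GiB

# Example: 10 OSDs at the 4 GiB default -> 66 GiB of node RAM.
```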
As you can see, there is a self-referential problem here. The OSD hardware recommendation should inform us how to set k8s resource limits for OSDs; however, doing so will affect osd_memory_target, which alters the recommendation, which further alters our k8s resource limit circularly forever.
We can address this issue with a semi-workaround currently:
set osd_memory_target explicitly in Ceph's config, and set an appropriate k8s resource request matching (osd_memory_target + 1GB + some extra) to meet the hardware recommendation. However, this means that the Ceph feature of setting osd_memory_target based on resource requests isn't really used, because it doesn't align with actual best practices. Setting a realistic k8s resource request is still useful for Kubernetes, so that it won't schedule more daemons onto a node than the node can realistically support.
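For example, the workaround might look roughly like this in a Rook CephCluster spec (assuming the per-daemon resources stanza; the value is illustrative):

```yaml
spec:
  resources:
    osd:
      requests:
        memory: "5Gi"   # osd_memory_target (4Gi) + 1Gi overhead + extra
```

with osd_memory_target pinned explicitly in Ceph's config so that the request does not feed back into the target.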
Long-term, I wonder whether it would be good to add into Ceph a computation that [[ osd_memory_target = REQUEST - osd_memory_request_overhead ]], where osd_memory_request_overhead defaults to 1GB or somewhat higher.
Please discuss, and let me know if anything here seems like I've gotten it wrong or if there are other options I haven't seen.
Cheers, and happy Tuesday!
Blaine
There is a general documentation meeting called the "DocuBetter Meeting",
and it is held every two weeks. The next DocuBetter Meeting will be on 13
May 2020 at 0830 PST, and will run for thirty minutes. Everyone with a
documentation-related request or complaint is invited. The meeting will be
held here: https://bluejeans.com/908675367
Send documentation-related requests and complaints to me by replying to
this email and CCing me at zac.dover(a)gmail.com.
The next DocuBetter meeting is scheduled for:
13 May 2020 0830 PST
13 May 2020 1630 UTC
14 May 2020 0230 AEST
Etherpad: https://pad.ceph.com/p/Ceph_Documentation
Meeting: https://bluejeans.com/908675367
Thanks, everyone.
Zac Dover
The basic premise is for an account to be a container for users, and
also related functionality like roles & groups. This would converge with
the AWS concept of an account, where the AWS account can further create
IAM users, roles, or groups. Every account can have a root user or
user(s) with permissions to administer the creation of users and allot
quotas within an account. These can be implemented with a new account
cap. The IAM set of APIs already has a large subset of functionality for
summarizing accounts and inspecting/creating users, roles, or groups.
Every account would also store the membership of its users, groups, and
roles (similar to a user's buckets), though we'd ideally limit this to
< 10k users/roles/groups per account.
In order to deal with tenants, which currently namespace buckets but
also stand in for the account id in the policy language & ARNs, we'd
have a tenant_id attribute in the account which, if set, will prevent
cross-tenant users from being added. Since this is not enforced when the
tenant id isn't set, accounts without this field set can potentially add
users across tenants; this is one of the cases where we expect the
account owner to know what they are doing.
We'd transition away from <tenant>:<user> in the Policy principal to
<account-id>:<user>, so if users with different tenants are in the same
account, we'd expect them to change their policies to use the account ids.
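As an illustration, a principal that today reads arn:aws:iam::<tenant>:user/<user> would become something like the following (ARN shape indicative only):

```json
{
  "Principal": {
    "AWS": ["arn:aws:iam::<account-id>:user/<user>"]
  }
}
```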
In terms of regular-operation IO costs, the user info would have an
account id attribute; if non-empty, we'd have to read the account root
user's policies and/or public access configuration, though other
attributes like the list of users, roles, and groups would only be read
for the necessary IAM/admin APIs.
Quotas
~~~~~~
Quotas could be implemented in one of the following ways:
- a user bytes/buckets quota, which would be allotted to every user
- a total account quota, in which case it is the responsibility of the account
user to allot a quota upon user creation
For operations themselves, though, it is the user quota that comes into play.
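A rough sketch of the account-quota variant (names are illustrative, not a proposed API): the account root allots per-user shares at user creation, while per-operation enforcement still consults only the user quota:

```python
class QuotaExceeded(Exception):
    pass

class Account:
    """Toy model: the account holds a total byte quota and the shares
    the account root has allotted to its users."""
    def __init__(self, total_bytes_quota):
        self.total = total_bytes_quota
        self.allotted = 0
        self.user_quota = {}

    def create_user(self, uid, user_bytes_quota):
        # Allotment happens at user creation, against the account total.
        if self.allotted + user_bytes_quota > self.total:
            raise QuotaExceeded("account quota exhausted")
        self.allotted += user_bytes_quota
        self.user_quota[uid] = user_bytes_quota

usage = {}

def check_put(account, uid, size):
    # Per-operation enforcement consults the *user* quota only.
    used = usage.get(uid, 0)
    if used + size > account.user_quota[uid]:
        raise QuotaExceeded(uid)
    usage[uid] = used + size
```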
APIs
~~~~
- Creating an account itself should be available via the admin tooling/APIs.
- Ideally, creation of a root user under an account would still have to be
explicit, though we could consider adding this to the account creation
process itself to simplify things.
- For further user creation and management, we could start implementing the
IAM set of APIs in the future; currently we already have admin APIs for
user creation and the like, and we could allow a user with account caps to
do these operations.
Deviations
~~~~~~~~~~
Some APIs, like AWS's list buckets, list all the buckets in the account
rather than those of the specific IAM user. We'd probably still list only
the user's buckets, though we could consider the account-wide behaviour
for the account root user.
With regard to the OpenStack Swift APIs, we'd keep the current user_id ->
swift account id mapping, so no breakage is expected in the end-user APIs;
the account stats and related APIs would behave as before, still displaying
the user's summary.
Comments on whether this is the right direction?
--
Abhishek
Hello everyone,
I have a small question.
GET /{api version}/{account} HTTP/1.1
Host: {fqdn}
X-Auth-Token: {auth-token}
This is a Swift request to get the list of buckets in an account. What are
{api version} and {auth-token} in this request? For {account} I have "test".
Also, should I use curl to make these requests?
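To make the question concrete, this is how I assume the request would look with curl (all values below are placeholders; the token would come from a prior authentication call against the gateway):

```shell
# Placeholder values -- substitute your own cluster endpoint and token.
fqdn="rgw.example.com:8000"
api_version="v1"                  # the Swift API version component is normally "v1"
account="AUTH_test"               # account portion; often "AUTH_<name>"
auth_token="AUTH_tk_placeholder"  # returned by an earlier auth request

url="http://${fqdn}/swift/${api_version}/${account}"

# The actual listing request (commented out here, since it needs a live cluster):
# curl -i -H "X-Auth-Token: ${auth_token}" "${url}"
echo "GET ${url}"
```

The exact URL prefix and whether the account appears in the path can depend on the gateway's Swift configuration, so treat the path above as a sketch.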
Hi all.
I'm getting lots of errors when getting bucket stats from my RGW instances,
and I can't figure out what's going wrong!
Can we add the bucket name to this error, to see what's happening in the cluster?
https://github.com/ceph/ceph/pull/34653
Thanks.