Hi!
I have a problem after starting an upgrade from 15.2.13 to 16.2.4. I started the upgrade and it successfully redeployed 2 of the 3 mgr daemon containers. The third failed to upgrade, and cephadm started retrying the upgrade forever. The only way I could stop this was to disable the cephadm module.
I found out I had an old version of podman installed and upgraded it to one of the suitable versions according to the requirements docs; I now have 3.0.1 installed.
This solved an issue where containers could not be started because a 'get podman version' command failed. (The Go template did not match the output of the older podman version.)
OK, so now it gets a little further in the process, but enabling the cephadm module still makes it retry the above action indefinitely. It now fails with this log:
https://pastebin.com/p3T1fbjs
At first I thought it had something to do with rate limits on docker.io, but it seems I can pull other images without problems. I also set up an account and played around with cephadm registry-login, but did not get much further.
Looking at the pull command in the logs, I see it uses some ID for the container image, which needs to be resolved, I suppose. Could it be making an error here, resulting in a bad URL that hits a resource it is not supposed to hit, and hence in access errors?
Any other thoughts on how to fix this error, or on how to make cephadm stop retrying this action so that I can fix it?
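For reference, the orchestrator can also be told to stop the upgrade, and the pull can be tested by hand (a sketch; the image name assumes the default docker.io/ceph/ceph:v16.2.4 upgrade target):

# Stop the in-progress upgrade so cephadm no longer retries the mgr redeploy
ceph orch upgrade stop

# Check which image cephadm is trying to pull
ceph orch upgrade status

# Try the pull by hand on the affected host to see the raw podman error
podman pull docker.io/ceph/ceph:v16.2.4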
Thanks very much and kind regards,
Samy
Hello Ceph-users,
I've upgraded my Ubuntu server from 18.04.5 LTS to Ubuntu 20.04.2 LTS via 'do-release-upgrade'.
During that process the Ceph packages were upgraded from Luminous to Octopus, and now the ceph-mon daemon (I have only one) won't start. The log error is:
"2021-06-15T20:23:41.843+0000 7fbb55e9b540 -1 mon.target@-1(probing) e2 current monmap has recorded min_mon_release 12 (luminous) is >2 releases older than installed 15 (octopus);
you can only upgrade 2 releases at a time you should first upgrade to 13 (mimic) or 14 (nautilus) stopping."
Is there any way to get the cluster running, or at least to get the data from the OSDs?
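For what it's worth, the recorded min_mon_release can be confirmed offline (a sketch; 'target' is the mon ID from the log above, and the monmap path is an example):

# Stop the mon, then extract and print its monmap
systemctl stop ceph-mon@target
ceph-mon -i target --extract-monmap /tmp/monmap
monmaptool --print /tmp/monmap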
I will appreciate any help.
Thank you
--
Best regards,
Petr
Hi
Our first upgrade (non-cephadm) from Octopus to Pacific 16.2.4 went very
smoothly. Thanks for all the effort.
The only thing that has bitten us is
https://tracker.ceph.com/issues/50556
which prevents a multipart
upload to an RGW bucket that has a bucket policy. While I've been able
to rewrite the most urgent scripts to use s3api put-object (which
doesn't do multipart), that only works for objects up to a certain size.
Removing the bucket policies isn't an option.
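For context, the rewritten calls look roughly like this (a sketch with the AWS CLI; endpoint and names are placeholders); a single PUT is limited to 5 GB in S3, hence the size cap:

# Single-request upload: no multipart, so the bucket-policy bug is not hit,
# but the object can be at most 5 GB
aws --endpoint-url https://rgw.example.com s3api put-object \
    --bucket mybucket --key backup.tar --body backup.tar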
I can see that it has been fixed, and is now pending backport
(https://tracker.ceph.com/issues/51001). Will this be included in
16.2.5? And do we have an estimated date for that?
We can wait a little longer, but otherwise I will have to make some more
drastic changes to an application. Having an indication of the date would
help me choose which way to go.
Many thanks, Chris
This is a great start, thank you! Basically I can look through the code to
get the keys I need.
But maybe I'm approaching this task the wrong way? Maybe there's already a
better solution for monitoring cluster health details?
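For example, the missing-key cases described in the quoted message below can be papered over with defaults (a sketch using jq; .health.overall_status as the Jewel-era fallback is my assumption, the other keys are from the quote):

# A missing key (e.g. no read activity) falls back to 0 via jq's // operator
ceph status -f json | jq '.pgmap.read_bytes_sec // 0'

# Tolerate the health key change between releases the same way
ceph status -f json | jq '.health.status // .health.overall_status'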
On Wed, 16 Jun 2021 at 02:47, Anthony D'Atri <anthony.datri(a)gmail.com> wrote:
> Before Luminous, mon clock skew was part of the health status JSON. With
> Luminous and later releases, one has to invoke a separate command to get
> the info.
>
> This is a royal PITA for monitoring / metrics infrastructure and I’ve
> never seen a reason why it was done.
>
> You might find the code here
> https://github.com/digitalocean/ceph_exporter
> useful. Note that there are multiple branches, which can be confusing.
>
> > On Jun 15, 2021, at 4:21 PM, Vladimir Prokofev <v(a)prokofev.me> wrote:
> >
> > Good day.
> >
> > I'm writing some code for parsing output data for monitoring purposes.
> > The data is that of "ceph status -f json", "ceph df -f json", "ceph osd
> > perf -f json" and "ceph osd pool stats -f json".
> > I also need to support all major Ceph releases, from Jewel through
> > Pacific.
> >
> > What I've stumbled upon is that:
> > - keys in JSON output are not present if there's no appropriate data.
> > For example the key ['pgmap', 'read_bytes_sec'] will not be present in
> > "ceph status" output if there's no read activity in the cluster;
> > - some keys changed between versions. For example, the ['health']['status']
> > key is not present in Jewel, but is available in all following versions;
> > vice versa, the ['osdmap', 'osdmap'] key is not present in Pacific, but is in
> > all previous versions.
> >
> > So I need to get a list of all possible keys for all Ceph releases. Any
> > ideas how this can be achieved? My only thought at the moment is to build a
> > "failing" cluster with all the possible states and get reference data out of
> > it. Not only is this tedious work, since it requires each possible cluster
> > version, but it is also prone to error.
> > Is there any publicly available JSON schema for output?
Hello,
I would like to ask about osd_scrub_max_preemptions in 14.2.20 for large
OSDs (mine are 12 TB) and/or large k+m EC pools (mine are 8+2). I searched
the archives of this list, but did not see any reference.
Symptoms:
I have been seeing a behavior in my cluster over the past 2 or 3 weeks
where, for no apparent reason, there are suddenly slow ops, followed by a
brief OSD down, massive but brief degradation/activating/peering, and then
back to normal.
I had thought this might have to do with some backfill activity due to a
recently failed OSD (as in down and out, and the process wouldn't start), but now
all of that is over and the cluster is mostly back to HEALTH_OK.
Thinking this might be something that was introduced between 14.2.9 and
14.2.16, I upgraded to 14.2.20 this morning. However, I just saw the same
kind of event happen twice again. At the time, the only non-client
activity was a single deep-scrub.
Question:
The description for osd_scrub_max_preemptions indicates that a deep scrub
process will allow itself to be preempted a fixed number of times by client
I/O and will then block client I/O until it finishes. Although I don't
fully understand the deep scrub process, it seems that either the size of
the HDD or the k+m count of the EC Pool could affect the time needed to
complete a deep scrub and thus increase the likelihood that more than the
default 5 preemptions will occur.
Please tell me if my understanding is correct. If so, is there any
guideline for increasing osd_scrub_max_preemptions just enough to balance
scrub progress against client responsiveness?
Or perhaps there are other scrub attributes that should be tuned instead?
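If tuning turns out to be the answer, this is how I plan to inspect and change the value at runtime (a sketch; the value 10 is an arbitrary example, not a recommendation):

# Show the value currently applied to OSDs (default is 5)
ceph config get osd osd_scrub_max_preemptions

# Raise it for all OSDs via the central config
ceph config set osd osd_scrub_max_preemptions 10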
Thanks.
-Dave
--
Dave Hall
Binghamton University
kdhall(a)binghamton.edu
I've been working on some improvements to our large cluster's space
balancing, when I noticed that sometimes the OSD maps have strange upmap
entries. Here is an example on a clean cluster (PGs are active+clean):
{
  "pgid": "1.1cb7",
  ...
  "up": [
    891,
    170,
    1338
  ],
  "acting": [
    891,
    170,
    1338
  ],
  ...
},
with an upmap entry:
pg_upmap_items 1.1cb7 [170,891]
This would make the "up" list [ 170, 170, 1338 ], which isn't allowed.
So the cluster just seems to ignore this upmap. When I remove the
upmap, nothing changes in the PG state, and I can even re-insert it
(without any effect). Any ideas why this upmap doesn't simply get
rejected/removed?
However, if I were to insert an upmap [170, 892], it gets rejected
correctly (since 891 and 892 are on the same host, violating CRUSH rules).
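For reference, these are the commands I use to inspect, remove, and re-insert such entries (standard CLI; the PG ID is the one from the example above):

# List all upmap exceptions in the osdmap
ceph osd dump | grep pg_upmap_items

# Remove the entry for this PG
ceph osd rm-pg-upmap-items 1.1cb7

# Re-insert the same mapping pair
ceph osd pg-upmap-items 1.1cb7 170 891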
Any insights would be helpful,
Andras
Hello Ceph-Users,
I've been wondering about the state of OpenStack Keystone Auth in RADOSGW.
1) Even though the general documentation on RADOSGW S3 bucket policies
is a little "misleading"
https://docs.ceph.com/en/latest/radosgw/bucketpolicy/#creation-and-removal
in showing users being referred to as Principal,
the documentation about Keystone integration at
https://docs.ceph.com/en/latest/radosgw/keystone/#integrating-with-openstac…
clearly states that "A Ceph Object Gateway user is mapped into a
Keystone <tenant>".
The Keystone authentication code strictly takes only the project
from the authenticating user:
*
https://github.com/ceph/ceph/blob/6ce6874bae8fbac8921f0bdfc3931371fc61d4ff/…
*
https://github.com/ceph/ceph/blob/6ce6874bae8fbac8921f0bdfc3931371fc61d4ff/…
This is rather unfortunate, as it reduces the usually powerful S3
bucket policies to rather basic ones: access can only be granted to all users
(with a certain role) of a project or, more importantly, to all users of
another project / tenant, as in using
arn:aws:iam::$OS_REMOTE_PROJECT_ID:root
as Principal.
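That is, the broadest grant currently possible looks roughly like this (a sketch; 'mybucket' and the s3:GetObject action are placeholders, and s3cmd is just one way to apply it):

# Grants read access to ALL users of the remote project; substitute
# $OS_REMOTE_PROJECT_ID with the actual project ID before applying
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam::$OS_REMOTE_PROJECT_ID:root"]},
    "Action": ["s3:GetObject"],
    "Resource": ["arn:aws:s3:::mybucket/*"]
  }]
}
EOF
s3cmd setpolicy policy.json s3://mybucket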
Or am I misreading something here, or is this really all that can be
done when using native Keystone auth?
2) There is a PR open implementing generic external authentication
https://github.com/ceph/ceph/pull/34093
Apparently this also addresses the lack of support for subusers
with Keystone; if I understand it correctly, I could then grant access
to individual users via
arn:aws:iam::$OS_REMOTE_PROJECT_ID:$user
Are there any plans on the roadmap to extend the functionality with
regard to Keystone as an authentication backend?
I know a similar question has been asked before
(https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/GY7VUKCQ5QU…)
but unfortunately there was no discussion / response then.
Regards
Christian
Hi all,
I know this came up before but I couldn't find a resolution.
We get the error
libceph: monX session lost, hunting for new mon
a lot on our Samba servers that re-export CephFS. "A lot" means more than
once a minute. On other machines that are less busy we get it about
every 10-30 minutes. We use only a single network for both client and
backend traffic, on bonded 10 GbE links.
So, my questions are: is this expected and normal behaviour? And how can I
track this problem down?
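In case it helps with the diagnosis, the data I can easily collect is from the client log and the mon admin socket (a sketch; replace 'a' with an actual mon ID):

# Client side: timestamps and frequency of the session losses
dmesg -T | grep libceph

# Mon side: list the sessions this monitor currently holds
ceph daemon mon.a sessions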
Regards
magnus