There is a general documentation meeting called the "DocuBetter Meeting",
and it is held every two weeks. The next DocuBetter Meeting will be on Nov
26, 2020 at 0200 UTC, and will run for thirty minutes. Everyone with a
documentation-related request or complaint is invited. The meeting will be
held here: https://bluejeans.com/908675367
(Since this particular instance of this meeting will be held on the
Wednesday before United States Thanksgiving, I expect a light turnout.)
Send documentation-related requests and complaints to me by replying to
this email and CCing me at zac.dover(a)gmail.com.
The next DocuBetter meeting is scheduled for:
26 Nov 2020 0200 UTC
Etherpad: https://pad.ceph.com/p/Ceph_Documentation
Meeting: https://bluejeans.com/908675367
Thanks, everyone.
Zac Dover
Hi all,
For placement purposes Ceph uses the straw2 bucket algorithm by default. I am
curious whether the other bucket algorithms, such as uniform and list, are
also being used in present-day data centers. Are there any use cases where
straw2 is not used at all?
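(A quick way to check, in case it helps: each bucket's algorithm is visible in
the decompiled CRUSH map. A minimal sketch; the file names are arbitrary.)
```
# export the CRUSH map and decompile it to text
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# each bucket stanza names its algorithm, e.g. "alg straw2"
grep 'alg ' crushmap.txt
```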
BR
I'll carry this message to the Leadership Team this week and, if
Thanksgiving proves an impediment to addressing it, next week.
Thanks for raising this concern.
Zac Dover
Upstream Docs
Ceph
On Wed, Nov 25, 2020 at 1:43 AM Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
>
>
> 2nd that. Why even remove old documentation before it is migrated to the
> new environment? It should be left online until the migration has
> successfully completed.
>
>
>
> -----Original Message-----
> Sent: Tuesday, November 24, 2020 4:23 PM
> To: Frank Schilder
> Cc: ceph-users
> Subject: [ceph-users] Re: Documentation of older Ceph version not
> accessible anymore on docs.ceph.com
>
> I want to just echo this sentiment. I thought the lack of older docs
> would be a very temporary issue, but they are still not available. It is
> especially frustrating when half the Google searches also return a
> page-not-found error. The migration has been very badly done.
>
> Sincerely,
>
> On Tue, Nov 24, 2020 at 2:52 AM Frank Schilder <frans(a)dtu.dk> wrote:
>
> > Older versions are available here:
> >
> >
> > https://web.archive.org/web/20191226012841/https://docs.ceph.com/docs/mimic/
> >
> > I'm actually also a bit unhappy about older versions missing. Mimic is
> > not end-of-life and a lot of people still use Luminous. Since there
> > are such dramatic differences between interfaces, the old docs should
> > not just disappear.
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Dan Mick <dmick(a)redhat.com>
> > Sent: 24 November 2020 01:53:29
> > To: Martin Palma
> > Cc: ceph-users
> > Subject: [ceph-users] Re: Documentation of older Ceph version not
> > accessible anymore on docs.ceph.com
> >
> > I don't know the answer to that.
> >
> > On 11/23/2020 6:59 AM, Martin Palma wrote:
> > > Hi Dan,
> > >
> > > yes, I noticed, but now only "latest", "octopus" and "nautilus" are
> > > offered for viewing. For older versions I had to go directly to
> > > GitHub.
> > >
> > > Also simply switching the URL from
> > > "https://docs.ceph.com/en/nautilus/" to
> > > "https://docs.ceph.com/en/luminous/" will not work any more.
> > >
> > > Is it planned to make the documentation of the older versions
> > > available again through docs.ceph.com?
> > >
> > > Best,
> > > Martin
> > >
> > > On Sat, Nov 21, 2020 at 2:11 AM Dan Mick <dmick(a)redhat.com> wrote:
> > >>
> > >> On 11/14/2020 10:56 AM, Martin Palma wrote:
> > >>> Hello,
> > >>>
> > >>> maybe I missed the announcement, but why is the documentation of
> > >>> the older Ceph versions not accessible anymore on docs.ceph.com?
> > >>
> > >> The UI has changed because we're hosting the docs on readthedocs.com
> > >> now. See the dropdown in the lower right corner.
> > >>
> > >
> > _______________________________________________
> > ceph-users mailing list -- ceph-users(a)ceph.io
> > To unsubscribe send an email to ceph-users-leave(a)ceph.io
> >
>
>
> --
> Steven Pine
>
> *E * steven.pine(a)webair.com | *P * 516.938.4100 x
> *Webair* | 501 Franklin Avenue Suite 200, Garden City NY, 11530
> webair.com
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>
Sorry, I can't compare; I have not used Ceph in anger anywhere else. In our
case we were looking for Kubernetes on-premises storage, and several points
led us to the Ambedded solution. I wouldn't expect them to have the sort of
throughput of a full-size Xeon server, but for our immediate purposes that
is not really an issue.
- It was a turnkey solution with support. We knew very little about
either Kubernetes or Ceph, and this was the lowest-risk option for us.
- Size and power. We only have single racks in two datacentres, so space is
a serious consideration. Ceph is very machine hungry, and alternatives like
SoftIron ran to at least 7U. We get 24 micro servers in 3U, power is only
105W per unit, and there is little heat.
- Cost. These appliances are incredibly inexpensive for the amount of
storage they provide. Even the smallest offerings from vendors like Dell/EMC
were both a lot larger and an astronomical amount more expensive. The
licencing is purely related to the physical appliances. Hard drives and M.2
cache can be upgraded. You can even run on a single appliance, albeit with
much reduced resilience and lost capacity, which makes evaluation really
inexpensive.
- Open source standard. Anything we learn from running these appliances
is directly translatable to any Ceph install. Anything we learned on
Dell/EMC would be yet more lock in to Dell/EMC.
We intend to experiment with Rook in the near future, but our inexperience
with both Kubernetes and Ceph made this option too risky for the initial
stages. If we run Rook properly, we think we will be able to co-locate things
like databases with their OSDs and storage on one server, so that performance
is optimal while we keep the management control of Ceph. But the bulk of our
data storage is exactly that and doesn't require massive performance.
On Wed, 25 Nov 2020 at 09:35, Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
>
> How does ARM compare to Xeon in latency and cluster utilization?
>
Hi!
I have a Nautilus Ceph cluster. Today I restarted one of the OSD daemons
and spent some time trying to debug an error I was seeing in the log, though
it seems the OSD is actually working.
The error I was seeing is:
```
Nov 25 09:07:43 osd15 systemd[1]: Starting Ceph object storage daemon osd.44...
Nov 25 09:07:43 osd15 systemd[1]: Started Ceph object storage daemon osd.44.
Nov 25 09:07:47 osd15 ceph-osd[12230]: 2020-11-25 09:07:47.846 7f55395fbc80 -1 osd.44 106947 log_to_monitors {default=true}
Nov 25 09:07:47 osd15 ceph-osd[12230]: 2020-11-25 09:07:47.850 7f55395fbc80 -1 osd.44 106947 mon_cmd_maybe_osd_create fail: 'osd.44 has already bound to class 'ssd', can not reset class to 'hdd'; use 'ceph osd crush rm-device-class <id>' to remove old class first': (16) Device or resource busy
```
There are no other messages in the journal, so at first I thought that the
OSD had failed to start.
But it seems to be up and working correctly anyhow.
There's no "hdd" class in my crush map:
```
# ceph osd crush class ls
[
"ssd"
]
```
And that osd is actually of the correct class:
```
# ceph osd crush get-device-class osd.44
ssd
```
```
# uname -a
Linux osd15 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1 (2020-06-07) x86_64 GNU/Linux
# ceph --version
ceph version 14.2.5-1-g23e76c7aa6 (23e76c7aa6e15817ffb6741aafbc95ca99f24cbb) nautilus (stable)
```
The OSD shows up in the cluster and is receiving load, so there seems to be
no problem, but does anyone know what that error is about?
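(For reference only: if the class ever did need changing, the sequence the
error message points at would be roughly the following. osd.44 and "ssd" are
taken from the output above; this is a sketch, not something this cluster
appears to need.)
```
# drop the recorded device class, then set the desired one
ceph osd crush rm-device-class osd.44
ceph osd crush set-device-class ssd osd.44
```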
Thanks!
--
David Caro
SRE - Cloud Services
Wikimedia Foundation <https://wikimediafoundation.org/>
PGP Signature: 7180 83A2 AC8B 314F B4CE 1171 4071 C7E1 D262 69C3
"Imagine a world in which every single human being can freely share in the
sum of all knowledge. That's our commitment."
We are running our S3 Ceph cluster on Nautilus 14.2.9 and want to use
features like object lock.
Creating a bucket with object lock enabled is possible with:
aws s3api create-bucket --bucket locktest --endpoint http://our-s3 --object-lock-enabled-for-bucket
aws s3api put-object-lock-configuration --bucket locktest --endpoint http://our-s3 --object-lock-configuration '{ "ObjectLockEnabled": "Enabled", "Rule": { "DefaultRetention": { "Mode": "COMPLIANCE", "Days": 50 }}}'
Objects in the bucket are still deletable, though, and we don't know why,
because this feature was backported to 14.2.5 as you can see here:
https://github.com/ceph/ceph/pull/29905
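(In case it helps to reproduce: the effective lock configuration and the
object versions can be inspected as below; bucket name and endpoint as above.
This is only a sanity-check sketch, since compliance retention protects
specific object versions.)
```
# confirm the lock configuration actually took effect
aws s3api get-object-lock-configuration --bucket locktest --endpoint http://our-s3
# retention applies per version; list versions and delete markers
aws s3api list-object-versions --bucket locktest --endpoint http://our-s3
```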
Any idea what we are doing wrong?
--
Beste Grüße aus Köln Ehrenfeld
Torsten Ennenbach
Cloud Architect
FYI: I've found a solution for the Grafana Certificate. Just run the
following commands:
1.
ceph config-key set mgr/cephadm/grafana_crt -i <cert>
ceph config-key set mgr/cephadm/grafana_key -i <key>
2.
ceph orch redeploy grafana
3.
ceph config set mgr mgr/dashboard/GRAFANA_API_URL https://ceph01.domain.tld:3000
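(A quick sanity check, purely a suggestion on my part: confirm the key was
stored and the URL was picked up.)
```
# verify the stored certificate and the dashboard's Grafana URL
ceph config-key get mgr/cephadm/grafana_crt
ceph config get mgr mgr/dashboard/GRAFANA_API_URL
```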
Hello,
Three OSD daemons crashed at the same time while processing the same
object, located in an RBD EC 4+2 pool, leaving a placement group in an
inactive down state. Soon after I start the OSD daemons back up, they
crash again, choking on the same object.
----------------------------8<------------------------------------
_dump_onode 0x5605a27ca000
4#7:8565da11:::rbd_data.6.a8a8356fd674f.00000000003dce34:head# nid
1889617 size 0x100000 (1048576) expected_object_size 0
expected_write_size 0 in 8 shards, 32768 spanning blobs
----------------------------8<------------------------------------
Please take a look at the attached log file.
Ceph status reports:
Reduced data availability: 1 pg inactive, 1 pg down
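(For readers following along, the down PG can be identified and inspected
like this; the PG id is a placeholder.)
```
# list PGs currently down, then ask one why it is down
ceph pg ls down
ceph pg <pgid> query
```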
Any hints on how to get this placement group back online would be
greatly appreciated.
Milan
--
Milan Kupcevic
Senior Cyberinfrastructure Engineer at Project NESE
Harvard University
FAS Research Computing
Hi,
With Ceph Octopus 15.2.5, here is the output of the command
"ceph device get-health-metrics SEAGATE_DL2400MM0159_WBM2WP2S":
===============================
"20201123-000939": {
"dev": "/dev/sde",
"error": "smartctl failed",
"nvme_smart_health_information_add_log_error": "nvme returned an error: sudo: exit status: 231",
"nvme_smart_health_information_add_log_error_code": -22,
"nvme_vendor": "seagate",
"smartctl_error_code": -22,
"smartctl_output": "smartctl returned an error (1): stderr:\nsudo: exit status: 1\nstdout:\nsmartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.0-147.el8.x86_64] (local build)\nCopyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org\n\n=======> UNRECOGNIZED OPTION: json=o\n\nUse smartctl -h to get a usage summary\n\n"
},
===============================
Is this error expected?
The disk is a SAS HDD; why are there keywords with an "nvme_" prefix?
Are the same keywords used for all types of disks?
Where can I see the command-line options used when running smartctl?
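(One observation, offered as a guess: smartctl 6.6 predates JSON output,
which arrived in smartmontools 7.0, hence the rejected json=o option, so
upgrading smartmontools may clear the error. The following reproduces the
scrape by hand; the exact flags Ceph passes are my assumption.)
```
# check the installed smartmontools version; --json needs >= 7.0
smartctl --version
# roughly what the devicehealth scrape attempts on this disk
sudo smartctl -a --json=o /dev/sde
```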
Thanks!
Tony