July 2020 - ceph-users - lists.ceph.io

Showing OSD Disk config?

by Lindsay Mathieson

Is there a way to display an OSD's setup - data, data.db and WAL disks/partitions? -- Lindsay

3 years, 9 months

3
3
0 0

Re: Ceph OSD not mounting after reboot

by Rodrigo Severo - Fábrica

Em sáb., 4 de jul. de 2020 às 11:27, Dave Hall <kdhall(a)binghamton.edu> escreveu: > > Rodrigo, > > I tried to send this to the list last night, but it looks like it didn't go through. I had a problem very much like this when I was first setting up my cluster. It turned out to be a missing systemd unit file. I would suggest that you check to see that you have an instance of ceph-volume@.service and an instance of ceph-osd@.service running for each OSD. I manually started my ceph-volume@ instances and then I managed to successfully restart my ceph-osd@ instances. My OSDs are back. Does anybody knows how is responsible for calling the ceph-volume@ instances on boot? Regards, Rodrigo

3 years, 9 months

1
0
0 0

Re: Ceph OSD not mounting after reboot

by Rodrigo Severo - Fábrica

Em sáb., 4 de jul. de 2020 às 11:27, Dave Hall <kdhall(a)binghamton.edu> escreveu: > > Rodrigo, > > I tried to send this to the list last night, but it looks like it didn't go through. I had a problem very much like this when I was first setting up my cluster. It turned out to be a missing systemd unit file. I would suggest that you check to see that you have an instance of ceph-volume@.service and an instance of ceph-osd@.service running for each OSD. > > ceph-volume(a)lvm-10-a63a2465-9de2-497e-bf60-72ed4c3c4c33.service > ceph-volume(a)lvm-11-a2151f1f-84f5-407b-a730-9d2a9502a85f.service > ceph-volume(a)lvm-12-7a16123b-7a0f-40a2-9159-5555724d9978.service > ceph-volume(a)lvm-13-bda32c07-ecc7-40c4-8066-383977c5e795.service > ceph-volume(a)lvm-14-4cc89d3a-5390-415d-b12b-9a557b7b5950.service > ceph-volume(a)lvm-15-21bbdaf5-1994-43a9-aa80-982689f51438.service > ceph-volume(a)lvm-8-cda4394b-e132-4530-8044-3cbbfbcbea19.service > ceph-volume(a)lvm-9-72ff199d-a4e4-4d26-9a4f-00337f8fdc7c.service > > ceph-osd(a)10.service > ceph-osd(a)12.service > ceph-osd(a)14.service > ceph-osd(a)8.service > ceph-osd(a)11.service > ceph-osd(a)13.service > ceph-osd(a)15.service > ceph-osd(a)9.service > > In my case I found that the proto-unit for one of these (ceph-volume@.service, I think) was missing from /var/lib/systemd/system. In fact, it was missing from the Debian install package for some reason. I think I had to unpack a copy of the Debian SRC pacakge to retrieve the file. Once I added the missing file and made sure the unit was enabled, things started working better and making more sense. I don't recall if all of the instances created themselves or whether I had to do something addtional, but it worked. Hi Dave, I'm looking around but can't identify any missing file. ceph-osd@.service and ceph-volume@.service are present. I'm looking on the other servers for some other file that might be missing but can't find anything. Thanks for your help, Rodrigo

3 years, 9 months

1
0
0 0

Re: Ceph OSD not mounting after reboot

by Rodrigo Severo - Fábrica

Em sex., 3 de jul. de 2020 às 17:41, Marc Roos <M.Roos(a)f1-outsourcing.eu> escreveu: > > So mount it, if it is empty Sure. That was my first impulse but as I said, in my other osd servers, these mounts are tmpfs filesystems. It's easy to manually mount them but how would I populate them? Regards, Rodrigo Severo > > > > -----Original Message----- > To: ceph-users > Subject: [ceph-users] Ceph OSD not mounting after reboot > > Hi, > > > Just rebooted one of my OSD servers after upgrading Ceph from 14.2.9 to > 14.2.10 and it's OSDs won't come up. > > I find the following messages on my log: > > 4991 Jul 3 17:24:03 osdserver1-df ceph-osd[1272]: 2020-07-03 > 17:24:03.036 7fcc497f1c00 -1 auth: unable to find a keyring on > /var/lib/ceph/osd/ceph-6/keyring: (2) No such file or directory > 4992 Jul 3 17:24:03 osdserver1-df ceph-osd[1272]: 2020-07-03 > 17:24:03.036 7fcc497f1c00 -1 AuthRegistry(0x55e2ff810140) no keyring > found at /var/lib/ceph/osd/ceph-6/keyring, disabling cephx > > and my /var/lib/ceph/osd/ceph-6 directory is empty. > > I see that on my other servers these /var/lib/ceph/osd/ceph-? > directories are tmpfs mounts but I can't understand who is responsible > for mounting them as there are no entries for them in /etc/fstab. > > How can I fix this osd server? > > > Regards, > > Rodrigo Severo > _______________________________________________ > ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an > email to ceph-users-leave(a)ceph.io > >

3 years, 9 months

2
1
0 0

Ceph OSD not mounting after reboot

by Rodrigo Severo - Fábrica

Hi, Just rebooted one of my OSD servers after upgrading Ceph from 14.2.9 to 14.2.10 and it's OSDs won't come up. I find the following messages on my log: 4991 Jul 3 17:24:03 osdserver1-df ceph-osd[1272]: 2020-07-03 17:24:03.036 7fcc497f1c00 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-6/keyring: (2) No such file or directory 4992 Jul 3 17:24:03 osdserver1-df ceph-osd[1272]: 2020-07-03 17:24:03.036 7fcc497f1c00 -1 AuthRegistry(0x55e2ff810140) no keyring found at /var/lib/ceph/osd/ceph-6/keyring, disabling cephx and my /var/lib/ceph/osd/ceph-6 directory is empty. I see that on my other servers these /var/lib/ceph/osd/ceph-? directories are tmpfs mounts but I can't understand who is responsible for mounting them as there are no entries for them in /etc/fstab. How can I fix this osd server? Regards, Rodrigo Severo

3 years, 9 months

4
4
0 0

[Octopus] OSD won’t work with Docker

by Sean Johnson

I have a situation were OSDs won’t work as Docker containers with Octopus on an Ubuntu 2020.04 host. The cephadm adopt —style legacy —name osd.8 command works as expected, and sets up the /var/lib/ceph/ directory as expected: root@balin:~# ll /var/lib/ceph/c3d06c94-bb66-4f84-bf78-470a2364b667/osd.8/ total 64 drwx------ 2 167 167 4096 Jul 3 08:39 ./ drwx------ 8 167 167 4096 Jul 3 08:30 ../ lrwxrwxrwx 1 167 167 93 Jul 3 08:39 block -> /dev/ceph-590abe6f-abac-46b7-9455-80f69b63cf89/osd-block-5b229640-14e5-4e7f-9993-9368d172c30c -rw------- 1 167 167 37 Jul 3 08:39 ceph_fsid -rw------- 1 167 167 283 Jul 3 08:14 config -rw------- 1 167 167 37 Jul 3 08:39 fsid -rw------- 1 167 167 55 Jul 3 08:39 keyring -rw------- 1 167 167 6 Jul 3 08:39 ready -rw------- 1 167 167 3 May 14 21:41 require_osd_release -rw------- 1 167 167 10 Jul 3 08:39 type -rw------- 1 167 167 38 Jul 3 08:14 unit.configured -rw------- 1 167 167 48 May 14 21:41 unit.created -rw------- 1 167 167 28 Jul 3 08:14 unit.image -rw------- 1 167 167 825 Jul 3 08:14 unit.poststop -rw------- 1 167 167 2367 Jul 3 08:14 unit.run -rw------- 1 167 167 2 Jul 3 08:39 whoami However, the problem shows up when the container starts. Jul 03 08:39:40 balin systemd[1]: Started Ceph osd.8 for c3d06c94-bb66-4f84-bf78-470a2364b667. Jul 03 08:39:41 balin bash[5412]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-8 Jul 03 08:39:41 balin bash[5412]: Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-590abe6f-abac-46b7-9455-80f69b63cf89/osd-blo> Jul 03 08:39:41 balin bash[5412]: Running command: /usr/bin/ln -snf /dev/ceph-590abe6f-abac-46b7-9455-80f69b63cf89/osd-block-5b229640-14e5-4e7f-9993-9368d172c30c /var/li> Jul 03 08:39:41 balin bash[5412]: Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-8/block Jul 03 08:39:41 balin bash[5412]: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-1 Jul 03 08:39:41 balin bash[5412]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-8 Jul 03 08:39:41 balin bash[5412]: --> ceph-volume lvm activate successful for osd ID: 8 Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.596+0000 7f52b1e70f40 0 set uid:gid to 167:167 (ceph:ceph) Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.596+0000 7f52b1e70f40 0 ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable), pro> Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.596+0000 7f52b1e70f40 0 pidfile_write: ignore empty --pid-file Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.600+0000 7f52b1e70f40 1 bdev create path /var/lib/ceph/osd/ceph-8/block type kernel Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.600+0000 7f52b1e70f40 1 bdev(0x55b992cd8000 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/ceph/osd/ceph> Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.600+0000 7f52b1e70f40 1 bdev(0x55b992cd8000 /var/lib/ceph/osd/ceph-8/block) open size 12000134430720 (0xae9f> Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.600+0000 7f52b1e70f40 1 bluestore(/var/lib/ceph/osd/ceph-8) _set_cache_sizes cache_size 1073741824 meta 0.4 > Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.600+0000 7f52b1e70f40 1 bdev create path /var/lib/ceph/osd/ceph-8/block type kernel Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.600+0000 7f52b1e70f40 1 bdev(0x55b992cd8700 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/ceph/osd/ceph> Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.600+0000 7f52b1e70f40 1 bdev(0x55b992cd8700 /var/lib/ceph/osd/ceph-8/block) open size 12000134430720 (0xae9f> Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.600+0000 7f52b1e70f40 1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-8/block size 11 TiB Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.600+0000 7f52b1e70f40 1 bdev(0x55b992cd8700 /var/lib/ceph/osd/ceph-8/block) close Jul 03 08:39:41 balin bash[5482]: debug 2020-07-03T13:39:41.888+0000 7f52b1e70f40 1 bdev(0x55b992cd8000 /var/lib/ceph/osd/ceph-8/block) close Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.141+0000 7f52b1e70f40 0 starting osd.8 osd_data /var/lib/ceph/osd/ceph-8 /var/lib/ceph/osd/ceph-8/journal Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.153+0000 7f52b1e70f40 0 load: jerasure load: lrc load: isa Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.153+0000 7f52b1e70f40 1 bdev create path /var/lib/ceph/osd/ceph-8/block type kernel Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.153+0000 7f52b1e70f40 1 bdev(0x55b992cd8000 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/ceph/osd/ceph> Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.153+0000 7f52b1e70f40 -1 bdev(0x55b992cd8000 /var/lib/ceph/osd/ceph-8/block) open open got: (13) Permission d> Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 0 osd.8:0.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQu> Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 0 osd.8:1.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQu> Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 0 osd.8:2.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQu> Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 0 osd.8:3.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQu> Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 0 osd.8:4.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQu> Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 -1 bluestore(/var/lib/ceph/osd/ceph-8/block) _read_bdev_label failed to open /var/lib/c> Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 1 bluestore(/var/lib/ceph/osd/ceph-8) _mount path /var/lib/ceph/osd/ceph-8 Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 -1 bluestore(/var/lib/ceph/osd/ceph-8/block) _read_bdev_label failed to open /var/lib/c> Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 1 bdev create path /var/lib/ceph/osd/ceph-8/block type kernel Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 1 bdev(0x55b992cd8000 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/ceph/osd/ceph> Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 -1 bdev(0x55b992cd8000 /var/lib/ceph/osd/ceph-8/block) open open got: (13) Permission d> Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 -1 osd.8 0 OSD:init: unable to mount object store Jul 03 08:39:42 balin bash[5482]: debug 2020-07-03T13:39:42.157+0000 7f52b1e70f40 -1 ** ERROR: osd init failed: (13) Permission denied Jul 03 08:39:42 balin systemd[1]: ceph-c3d06c94-bb66-4f84-bf78-470a2364b667(a)osd.8.service: Main process exited, code=exited, status=1/FAILURE Jul 03 08:39:42 balin systemd[1]: ceph-c3d06c94-bb66-4f84-bf78-470a2364b667(a)osd.8.service: Failed with result 'exit-code'. Jul 03 08:39:53 balin systemd[1]: ceph-c3d06c94-bb66-4f84-bf78-470a2364b667(a)osd.8.service: Scheduled restart job, restart counter is at 6. Jul 03 08:39:53 balin systemd[1]: Stopped Ceph osd.8 for c3d06c94-bb66-4f84-bf78-470a2364b667. Jul 03 08:39:53 balin systemd[1]: ceph-c3d06c94-bb66-4f84-bf78-470a2364b667(a)osd.8.service: Start request repeated too quickly. Jul 03 08:39:53 balin systemd[1]: ceph-c3d06c94-bb66-4f84-bf78-470a2364b667(a)osd.8.service: Failed with result 'exit-code'. Jul 03 08:39:53 balin systemd[1]: Failed to start Ceph osd.8 for c3d06c94-bb66-4f84-bf78-470a2364b667. There’s a permission denied issue with the block, and I can’t find what might be causing that. If I set up an OSD fresh, it works fine with the docker container until I reboot, and then I get the same permission denied error. If I use ceph-volume lvm activate a standard ceph systemd service is created, and the OSD comes online without any problem, though it’s not visible to ceph orch.

3 years, 9 months

1
1
0 0

Object Gateway not working within the dashboard anymore after network change

by Hendrik Peyerl

Hi all, we are currently experiencing a problem with the Obejct Gateway part of the dashboard not working anymore: We had a working setup were the RGW servers only had 1 network interface with an IP address that was reachable by the monitor servers and the dashboard was working as expected. After our initial tests everything was working great and we decided to add another physical link to the RGW Servers for the traffic to the clients. With that network change we also had to set the default gateway to the new interface while adding static routes for the rest of the ceph environment. To avoid issues with hostnames (the old hostname now resolves to the new interface) we added another hostname for the internal traffic, purged the gateways from ceph and added them again via ceph-deploy rgw create with the new hostname. The S3 communication is working perfectly fine as it did before, we can reach all buckets and the monitors can communicate with the Gateway. The Dashboard however throws the following error whenever we navigate to any of the object gateway menus: ————————————————————————————————— 2020-07-03 10:33:41.871 7fa0f9dbc700 0 mgr[dashboard] [03/Jul/2020:10:33:41] HTTP Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/cherrypy/_cprequest.py", line 656, in respond response.body = self.handler() File "/usr/lib/python2.7/site-packages/cherrypy/lib/encoding.py", line 188, in __call__ self.body = self.oldhandler(*args, **kwargs) File "/usr/lib/python2.7/site-packages/cherrypy/_cptools.py", line 221, in wrap return self.newhandler(innerfunc, *args, **kwargs) File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 88, in dashboard_exception_handler return handler(*args, **kwargs) File "/usr/lib/python2.7/site-packages/cherrypy/_cpdispatch.py", line 34, in __call__ return self.callable(*self.args, **self.kwargs) File "/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line 661, in inner ret = func(*args, **kwargs) File "/usr/share/ceph/mgr/dashboard/controllers/rgw.py", line 28, in status if not instance.is_service_online(): File "/usr/share/ceph/mgr/dashboard/rest_client.py", line 507, in func_wrapper **kwargs) File "/usr/share/ceph/mgr/dashboard/services/rgw_client.py", line 321, in is_service_online _ = request({'format': 'json'}) File "/usr/share/ceph/mgr/dashboard/rest_client.py", line 313, in __call__ data, raw_content) File "/usr/share/ceph/mgr/dashboard/rest_client.py", line 445, in do_request ex.args[0].reason.args[0]) File "/usr/lib64/python2.7/re.py", line 137, in match return _compile(pattern, flags).match(string) TypeError: expected string or buffer 2020-07-03 10:33:41.872 7fa0f9dbc700 0 mgr[dashboard] [2a02:2e0:13::a05:42784] [GET] [500] [45.044s] [plusline] [1.8K] /api/rgw/status 2020-07-03 10:33:41.872 7fa0f9dbc700 0 mgr[dashboard] ['{"status": "500 Internal Server Error", "version": "3.2.2", "traceback": "Traceback (most recent call last):\\n File \\"/usr/lib/python2.7/site- packages/cherrypy/_cprequest.py\\", line 656, in respond\\n response.body = self.handler()\\n File \\"/usr/lib/python2.7/site-packages/cherrypy/lib/encoding.py\\", line 188, in __call__\\n self.b ody = self.oldhandler(*args, **kwargs)\\n File \\"/usr/lib/python2.7/site-packages/cherrypy/_cptools.py\\", line 221, in wrap\\n return self.newhandler(innerfunc, *args, **kwargs)\\n File \\"/usr/s hare/ceph/mgr/dashboard/services/exception.py\\", line 88, in dashboard_exception_handler\\n return handler(*args, **kwargs)\\n File \\"/usr/lib/python2.7/site-packages/cherrypy/_cpdispatch.py\\", l ine 34, in __call__\\n return self.callable(*self.args, **self.kwargs)\\n File \\"/usr/share/ceph/mgr/dashboard/controllers/__init__.py\\", line 661, in inner\\n ret = func(*args, **kwargs)\\n F ile \\"/usr/share/ceph/mgr/dashboard/controllers/rgw.py\\", line 28, in status\\n if not instance.is_service_online():\\n File \\"/usr/share/ceph/mgr/dashboard/rest_client.py\\", line 507, in func_w rapper\\n **kwargs)\\n File \\"/usr/share/ceph/mgr/dashboard/services/rgw_client.py\\", line 321, in is_service_online\\n _ = request({\'format\': \'json\'})\\n File \\"/usr/share/ceph/mgr/dashb oard/rest_client.py\\", line 313, in __call__\\n data, raw_content)\\n File \\"/usr/share/ceph/mgr/dashboard/rest_client.py\\", line 445, in do_request\\n ex.args[0].reason.args[0])\\n File \\"/usr/lib64/python2.7/re.py\\", line 137, in match\\n return _compile(pattern, flags).match(string)\\nTypeError: expected string or buffer\\n", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "request_id": "e0d6ff11-4dad-496a-9ee7-9db036c46ab7"}'] ————————————————————————————————— We are running ceph version 14.2.9 on CentOS 7.7. Any help on how to debug this would be greatly apreciated. Best Regards, Hendrik

3 years, 9 months

2
3
0 0

Cannot remove cache tier

by Alexander E. Patrakov

Hello. I have tried to follow through the documented writeback cache tier removal procedure (https://docs.ceph.com/docs/master/rados/operations/cache-tiering/#removing-…) on a test cluster, and failed. I have successfully executed this command: ceph osd tier cache-mode alex-test-rbd-cache proxy Next, I am supposed to run this: rados -p alex-test-rbd-cache ls rados -p alex-test-rbd-cache cache-flush-evict-all The failure mode is that, while the client i/o still going on, I cannot get zero objects in the cache pool, even with the help of "rados -p alex-test-rbd-cache cache-flush-evict-all". And yes, I have waited more than 20 minutes (my cache tier has hit_set_count 10 and hit_set_period 120). I also tried to set both cache_target_dirty_ratio and cache_target_full_ratio to 0, it didn't help. Here is the relevant part of the pool setup: # ceph osd pool ls detail pool 25 'alex-test-rbd-metadata' replicated size 3 min_size 2 crush_rule 9 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode warn last_change 10973111 lfor 0/10971347/10971345 flags hashpspool,nodelete stripe_width 0 application rbd pool 26 'alex-test-rbd-data' erasure size 6 min_size 5 crush_rule 12 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode warn last_change 10973112 lfor 10971705/10971705/10971705 flags hashpspool,ec_overwrites,nodelete,selfmanaged_snaps tiers 27 read_tier 27 write_tier 27 stripe_width 16384 application rbd removed_snaps [1~3] pool 27 'alex-test-rbd-cache' replicated size 3 min_size 2 crush_rule 9 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode warn last_change 10973113 lfor 10971705/10971705/10971705 flags hashpspool,incomplete_clones,nodelete,selfmanaged_snaps tier_of 26 cache_mode proxy target_bytes 10000000000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 120s x10 decay_rate 0 search_last_n 0 stripe_width 0 application rbd removed_snaps [1~3] The relevant crush rules are selecting ssds for the alex-test-rbd-cache and alex-test-rbd-metadata pools (plain old "replicated size 3" pools), and hdds for alex-test-rbd-data (which is EC 4+2). The client workload, which seemingly outpaces the eviction and flushing, is: for a in `seq 1000 2000` ; do time rbd import --data-pool alex-test-rbd-data ./Fedora-Cloud-Base-32-1.6.x86_64.raw alex-test-rbd-metadata/Fedora-copy-$a done The ceph version is "ceph version 14.2.9 (2afdc1f644870fb6315f25a777f9e4126dacc32d) nautilus (stable)" on all osds. The relevant part of "ceph df" is: RAW STORAGE: CLASS SIZE AVAIL USED RAW USED %RAW USED hdd 23 TiB 20 TiB 2.9 TiB 3.0 TiB 12.99 ssd 1.7 TiB 1.7 TiB 19 GiB 23 GiB 1.28 TOTAL 25 TiB 22 TiB 2.9 TiB 3.0 TiB 12.17 POOLS: POOL ID STORED OBJECTS USED %USED MAX AVAIL <irrelevant pools omitted> alex-test-rbd-metadata 25 237 KiB 2.37k 59 MiB 0 564 GiB alex-test-rbd-data 26 691 GiB 198.57k 1.0 TiB 6.52 9.7 TiB alex-test-rbd-cache 27 5.1 GiB 2.99k 15 GiB 0.90 564 GiB The total size and the number of stored objects in the alex-test-rbd-cache pool oscillate around 5 GB and 3K, respectively, while "rados -p alex-test-rbd-cache cache-flush-evict-all" is running in a loop. Without it, the size grows to 6 GB and stays there. # ceph -s cluster: id: <omitted for privacy> health: HEALTH_WARN 1 cache pools at or near target size services: mon: 3 daemons, quorum xx-4a,xx-3a,xx-2a (age 10d) mgr: xx-3a(active, since 5w), standbys: xx-2b, xx-2a, xx-4a mds: cephfs:1 {0=xx-4b=up:active} 2 up:standby osd: 89 osds: 89 up (since 7d), 89 in (since 7d) rgw: 3 daemons active (xx-2b, xx-3b, xx-4b) tcmu-runner: 6 daemons active (<only irrelevant images here>) data: pools: 15 pools, 1976 pgs objects: 6.64M objects, 1.3 TiB usage: 3.1 TiB used, 22 TiB / 25 TiB avail pgs: 1976 active+clean io: client: 290 KiB/s rd, 251 MiB/s wr, 366 op/s rd, 278 op/s wr cache: 123 MiB/s flush, 72 MiB/s evict, 31 op/s promote, 3 PGs flushing, 1 PGs evicting Is there any workaround, short of somehow telling the client to stop creating new rbds? -- Alexander E. Patrakov CV: http://pc.cd/PLz7

3 years, 9 months

1
0
0 0

Re: NFS Ganesha 2.7 in Xenial not available

by Victoria Martinez de la Cruz

Thanks Ramana and David. So we are using the Shaman search API to get the latest build for ceph_nautilus flavor of NFS Ganesha, and that's how we get to the mentioned build. We are doing this since it's part of our CI and it's better for automation. Should we use different repos? Thanks, V On Wed, Jun 24, 2020 at 3:33 PM Victoria Martinez de la Cruz < vkmc(a)redhat.com> wrote: > Thanks Ramana and David. > > So we are using the Shaman search API to get the latest build for > ceph_nautilus flavor of NFS Ganesha, and that's how we get to the mentioned > build. We are doing this since it's part of our CI and it's better for > automation. > > Should we use different repos? > > Thanks, > > V > > On Tue, Jun 23, 2020 at 2:42 PM David Galloway <dgallowa(a)redhat.com> > wrote: > >> >> >> On 6/23/20 1:21 PM, Ramana Venkatesh Raja wrote: >> > On Tue, Jun 23, 2020 at 6:59 PM Victoria Martinez de la Cruz >> > <victoria(a)redhat.com> wrote: >> >> >> >> Hi folks, >> >> >> >> I'm hitting issues with the nfs-ganesha-stable packages [0], the repo >> url >> >> [1] is broken. Is there a known issue for this? >> >> >> > >> > The missing packages in chacra could be due to the recent mishap in >> > the sepia long running cluster, >> > >> https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/YQMAHTB7MUHL25QP7V… >> >> Hi Victoria, >> >> Ramana is correct. Do you need 2.7.4 specifically? If not, signed >> nfs-ganesha packages can also be found here: >> http://download.ceph.com/nfs-ganesha/ >> >> > >> >> Thanks, >> >> >> >> Victoria >> >> >> >> [0] >> >> >> https://shaman.ceph.com/repos/nfs-ganesha-stable/V2.7-stable/1a1fb71cdb811c… >> >> [1] >> >> >> https://chacra.ceph.com/r/nfs-ganesha-stable/V2.7-stable/1a1fb71cdb811c1bac… >> >> _______________________________________________ >> >> ceph-users mailing list -- ceph-users(a)ceph.io >> >> To unsubscribe send an email to ceph-users-leave(a)ceph.io >> >> >> > >> >>

3 years, 9 months

3
3
0 0

rbd audit

by Seena Fallah

Hi all. Is there any rbd audit like ceph.audit.log that could log which client runs which command from rbd client? Thanks.

3 years, 9 months

1
0
0 0

2024

2023

2022

2021

2020

2019

ceph-users July 2020