I created a cluster on Nautilus 14.2.0, then upgraded to 14.2.1, and finally to 14.2.3.
Now I am seeing this warning that I thought should only appear if the cluster was created pre-Nautilus.
Legacy BlueStore stats reporting detected on XX OSD
I can't seem to find any information about this happening on a cluster that was always on 14.2.
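If it helps anyone answering: as far as I can tell, the usual way to deal with the warning is to either convert each OSD to the new per-pool stats format or mute it, roughly like this (untested on my side, <id> is a placeholder), but that doesn't explain why a cluster that was never on pre-Nautilus is affected:
systemctl stop ceph-osd@<id>
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-<id>   # converts the OSD to per-pool stats
systemctl start ceph-osd@<id>
ceph config set osd bluestore_warn_on_legacy_statfs false       # or just silence the warning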
Gavin
Hi,
I have successfully configured the Ceph dashboard following this
documentation: <https://docs.ceph.com/docs/master/mgr/dashboard>.
According to the documentation you can configure a URL prefix with this
command:
ceph config set mgr mgr/dashboard/url_prefix $PREFIX
However, when I try to access the dashboard at the URL
http://$IP:$PORT/$PREFIX/ I get an error:
{"status": "404 Not Found", "version": "8.9.1", "detail": "The path
'/dashboard' was not found.", "traceback": "Traceback (most recent call
last):\n File
\"/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py\", line 670,
in respond\n response.body = self.handler()\n File
\"/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py\", line 220,
in __call__\n self.body = self.oldhandler(*args, **kwargs)\n File
\"/usr/lib/python2.7/dist-packages/cherrypy/_cperror.py\", line 415, in
__call__\n raise self\nNotFound: (404, \"The path '/dashboard' was
not found.\")\n"}
A workaround for this error is to use this URL:
http://$IP:$PORT/#/$PREFIX/
Using the # ensures that the redirect to the active MGR service works.
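In case it is relevant: the setting can be double-checked and the dashboard module restarted after changing the prefix, e.g. like this (I am not sure whether a restart is strictly required):
ceph config get mgr mgr/dashboard/url_prefix
ceph mgr module disable dashboard
ceph mgr module enable dashboard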
Regards
Thomas
Hi,
I upgraded to Nautilus from Mimic a while ago and enabled the pg_autoscaler.
When pg_autoscaler was activated I got a HEALTH_WARN regarding:
POOL_TARGET_SIZE_BYTES_OVERCOMMITTED 1 subtrees have overcommitted pool target_size_bytes
Pools ['cephfs_data_reduced', 'cephfs_data', 'cephfs_metadata'] overcommit available storage by 1.460x due to target_size_bytes 0 on pools []
POOL_TARGET_SIZE_RATIO_OVERCOMMITTED 1 subtrees have overcommitted pool target_size_ratio
Pools ['cephfs_data_reduced', 'cephfs_data', 'cephfs_metadata'] overcommit available storage by 1.460x due to target_size_ratio 0.000 on pools []
Both target_size_bytes and target_size_ratio on all the pools are set to 0, so I started to wonder why this error message appears.
My autoscale-status looks like this:
POOL                 SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
cephfs_metadata      16708M               4.0   34465G        0.0019                1.0   8                   warn
cephfs_data_reduced  15506G               2.0   34465G        0.8998                1.0   375                 warn
cephfs_data          6451G                3.0   34465G        0.5616                1.0   250                 warn
So the combined ratio is 1.4633.
Isn't a combined ratio of 1.0 across all pools equal to full?
I also enabled the Dashboard and saw that the PG Status showed "645% clean" PGs.
This cluster was originally installed with Jewel, so could some legacy setting or similar be causing this?
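For completeness, this is roughly how I checked that nothing is set on the pools:
ceph osd pool autoscale-status
ceph osd pool ls detail   # target_size_bytes / target_size_ratio are 0 or unset on every pool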
When creating an RGW user with this command:
radosgw-admin user create --uid={username} --display-name="{display-name}" [--email={email}]
there are two flags that I can use, --system and --admin. What are these flags for?
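For example (uid and display-name below are made up), I could run either of these, but I don't know what they would actually change:
radosgw-admin user create --uid=dashboard --display-name="Dashboard user" --system
radosgw-admin user create --uid=operator --display-name="Operator user" --admin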
--
------------------------------------------------------------
Wahyu Muqsita Wardana
System Engineer
------------------------------------------------------------
Jl. Ampera Raya Nomor 22, Cilandak Timur
Jakarta Selatan 12560, Indonesia.
T. +62217182008 | M. +62 8227 3185 744
www.bukalapak.com
Hi!
We're looking to keep our RGW pools free of orphan objects, but from the
documentation and the mailing list it is not really clear how it works and
what it will do.
radosgw-admin orphans find --pool= --job-id=
loops over all objects in the cluster looking for leaked objects and adds
them to shards in the rgw log pool.
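For reference, the full job sequence as we understand it from the docs is roughly this (pool and job names below are just examples):
radosgw-admin orphans find --pool=default.rgw.buckets.data --job-id=orphans-scan-1
radosgw-admin orphans list-jobs
radosgw-admin orphans finish --job-id=orphans-scan-1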
In our case, after more than 72 hours of running it seems stuck, using 24 GB of RAM.
The console shows:
7fc6ca719700 0 run(): building index of all bucket indexes
7fc6ca719700 0 run(): building index of all linked objects
7fc6ca719700 0 building linked oids index: 0/64
7fc6ca719700 0 building linked oids index: 1/64
Checking the rgw log pool, it has generated 64 large omap objects.
Does anyone have experience with orphan objects?
We estimate roughly 80-100 TB of orphan objects in our cluster.
Regards
Manuel
Hi there,
Did anyone get the mgr diskprediction-local plugin working on CentOS?
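For reference, I am enabling it the documented way:
ceph mgr module enable diskprediction_local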
When I enable the plugin with v14.2.3 I get:
HEALTH_ERR 2 mgr modules have failed
MGR_MODULE_ERROR 2 mgr modules have failed
Module 'devicehealth' has failed: Failed to import _strptime
because the import lock is held by another thread.
Module 'diskprediction_local' has failed: No module named
sklearn.svm.classes
When the package is installed it pulls in several dependencies, but
apparently these are not enough?
Installing:
 ceph-mgr-diskprediction-local   noarch   2:14.2.3-0.el7       ceph-noarch   1.1 M
Installing for dependencies:
 atlas                           x86_64   3.10.1-12.el7        base          4.5 M
 blas                            x86_64   3.4.2-8.el7          base          399 k
 lapack                          x86_64   3.4.2-8.el7          base          5.4 M
 libgfortran                     x86_64   4.8.5-39.el7         cr            300 k
 libquadmath                     x86_64   4.8.5-39.el7         cr            190 k
 numpy                           x86_64   1:1.7.1-13.el7       base          2.8 M
 numpy-f2py                      x86_64   1:1.7.1-13.el7       base          206 k
 python-devel                    x86_64   2.7.5-86.el7         cr            398 k
 python-nose                     noarch   1.3.7-1.el7          base          276 k
 python-rpm-macros               noarch   3-32.el7             cr            8.8 k
 python-srpm-macros              noarch   3-32.el7             cr            8.4 k
 python2-rpm-macros              noarch   3-32.el7             cr            7.7 k
 scipy                           x86_64   0.12.1-6.el7         base          9.3 M
 suitesparse                     x86_64   4.0.2-10.el7         base          928 k
 tbb                             x86_64   4.1-9.20130314.el7   base          124 k
I've seen https://tracker.ceph.com/issues/38088 but didn't find the
sklearn package in any standard repo.
Thanks!
Dan
I have a CephFS 13.2.6 setup composed of 3 OSD nodes + 1 MDS + 1 monitor. All the nodes are running CentOS Linux release 7.6.1810 (Core).
When writing a lot of files (500MB) to a directory, the "ls" command on that directory is very slow from an external client (if I list the directory from the same client node that is writing the files, the operation returns immediately):
[cephuser@stor2demo ~]$ time ll -h /mnt/cephfs/dir1/dir2
real 2m40.246s
user 0m0.002s
sys 0m0.003s
The Ceph health output shows this information:
[cephuser@stor1demo ~]$ ceph health detail
HEALTH_WARN 1 MDSs report slow metadata IOs; 1 MDSs report slow requests; 1/10269 objects misplaced (0.010%)
MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs
mdsstor1demo(mds.0): 5 slow metadata IOs are blocked > 30 secs, oldest blocked for 34 secs
MDS_SLOW_REQUEST 1 MDSs report slow requests
mdsstor1demo(mds.0): 2 slow requests are blocked > 30 secs
OBJECT_MISPLACED 1/10269 objects misplaced (0.010%)
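(I have not dug much further yet, but I assume the blocked requests could be inspected on the MDS host with something like
ceph daemon mds.stor1demo dump_ops_in_flight
if that is useful.)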
Why this behaviour?
Thanks.
On Wed, Sep 11, 2019 at 11:17:47AM +0100, Matthew Vernon wrote:
>Hi,
>
>We keep finding part-made OSDs (they appear not attached to any host,
>and down and out; but still counting towards the number of OSDs); we
>never saw this with ceph-disk. On investigation, this is because
>ceph-volume lvm create makes the OSD (ID and auth at least) too early in
>the process and is then unable to roll-back cleanly (because the
>bootstrap-osd credential isn't allowed to remove OSDs).
>
>As an example (very truncated):
>
>Running command: /usr/bin/ceph --cluster ceph --name
>client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
>-i - osd new 20cea174-4c1b-4330-ad33-505a03156c33
>Running command: vgcreate --force --yes
>ceph-9d66ec60-c71b-49e0-8c1a-e74e98eafb0e /dev/sdbh
> stderr: Device /dev/sdbh not found (or ignored by filtering).
> Unable to add physical volume '/dev/sdbh' to volume group
>'ceph-9d66ec60-c71b-49e0-8c1a-e74e98eafb0e'.
>--> Was unable to complete a new OSD, will rollback changes
>--> OSD will be fully purged from the cluster, because the ID was generated
>Running command: ceph osd purge osd.828 --yes-i-really-mean-it
> stderr: 2019-09-10 15:07:53.396528 7fbca2caf700 -1 auth: unable to find
>a keyring on
>/etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,:
>(2) No such file or directory
> stderr: 2019-09-10 15:07:53.397318 7fbca2caf700 -1 monclient:
>authenticate NOTE: no keyring found; disabled cephx authentication
>2019-09-10 15:07:53.397334 7fbca2caf700 0 librados: client.admin
>authentication error (95) Operation not supported
>
>This is annoying to have to clear up, and it seems to me could be
>avoided by either:
>
>i) ceph-volume should (attempt to) set up the LVM volumes &c before
>making the new OSD id
>or
>ii) allow the bootstrap-osd credential to purge OSDs
>
>i) seems like clearly the better answer...?
Agreed. Would you mind opening a bug report on
https://tracker.ceph.com/projects/ceph-volume.
I have found other situations where the roll-back is not working as it
should, though not with as much impact as this.
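FWIW, until the ordering is changed, the leftover IDs can at least be cleaned up manually from a node that has the admin keyring, e.g.:
ceph osd purge osd.828 --yes-i-really-mean-it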
>
>Regards,
>
>Matthew
>
--
Jan Fajerski
Senior Software Engineer Enterprise Storage
SUSE Software Solutions Germany GmbH
(HRB 247165, AG München)
Geschäftsführer: Felix Imendörffer
Hi,
sorry to disturb you with list admin stuff. I haven't received any new
ceph-users@ mail since August 28th (I was subscribed to the daily digest).
lists.ceph.com is now defunct, mails are bounced. I hope that is only
temporary, because most of my annotated Ceph bookmarks point to
lists.ceph.com...
From the website it looks like the list was moved to ceph.io. I don't
remember reading any announcement of this move (might be my fault).
I tried re-subscribing at lists.ceph.io, but it says I am already
subscribed. I tried logging in to check my preferences, but my old password
from lists.ceph.com does not work anymore. I created a new account, logged
in, and the subscription settings look OK to me.
Can you help me here? Maybe it is just the digests that do not work?
Please answer to me directly, as I am currently not receiving any list
messages.
Thank you
Matthias Ferdinand
Hi,
do you use hard links in your workload? The 'no space left on device'
message may also refer to too many stray files. Strays are either files
that are queued for deletion (e.g. via the purge queue) or files that have
been deleted but still have hard links pointing to the same content. Since
CephFS does not use an indirection layer between inodes and data, and the
data chunks are named after the inode ID, removing the original file will
leave stray entries because CephFS is not able to rename the underlying
RADOS objects.
There are 10 hidden directories for stray files, and given a maximum size
of 100,000 entries each, you can store only up to 1 million stray entries
in total. I don't know exactly how entries are distributed among the 10
directories, so the limit may be reached earlier for a single stray
directory. The performance counters contain some values for strays, so
they are easy to check. The daemonperf output also shows the current value.
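For example, on the active MDS something like this should print the current stray counters (<name> is the MDS daemon id):
ceph daemon mds.<name> perf dump | grep -i stray
num_strays is the interesting value here.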
The problem of the upper limit on directory entries was solved by
directory fragmentation, so you should check whether fragmentation is
allowed in your filesystem. You can also try to increase the upper
directory entry limit, but this might lead to other problems (too large
RADOS omap objects...).
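If fragmentation turns out to be disabled, or you want to raise the limit, these are roughly the relevant knobs (please double-check the option names for your release before changing anything):
ceph fs set <fsname> allow_dirfrags true                # older releases; newer ones have fragmentation enabled by default
ceph config set mds mds_bal_fragment_size_max 200000    # per-fragment entry limit, default 100000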
Regards,
Burkhard
--
Dr. rer. nat. Burkhard Linke
Bioinformatics and Systems Biology
Justus-Liebig-University Giessen
35392 Giessen, Germany
Phone: (+49) (0)641 9935810